DOCUMENT RESUME

ED 280 872                                                  TM 870 202

TITLE           Plans and Activities for 1990 Decennial Census. Part
                2: Hearings before the Subcommittee on Census and
                Population of the Committee on Post Office and Civil
                Service. House of Representatives, Ninety-Ninth
                Congress, Second Session (May 1, 15, 1986).
INSTITUTION     Congress of the U.S., Washington, D.C. House
                Committee on Post Office and Civil Service.
PUB DATE        May 86
NOTE            214p.; Serial No. 99-48. Some sections contain small
                print.
AVAILABLE FROM  Superintendent of Documents, Congressional Sales
                Office, U.S. Government Printing Office, Washington,
                DC 20402.
PUB TYPE        Legal/Legislative/Regulatory Materials (090)
EDRS PRICE      MF01/PC09 Plus Postage.
DESCRIPTORS     *Census Figures; *Data Collection; *Data Processing;
                *Demography; Federal Programs; Hearings; Measurement
                Objectives; *National Surveys; Population Trends;
                Questionnaires; *Research Design; Test
                Construction
IDENTIFIERS     Bureau of the Census; *Census 1990; Congress 99th

ABSTRACT 

Hearings on the 1990 Decennial Census were held on 
May 1, 1986, and May 15, 1986. The May 1 session focused on data 
processing procedures. Speakers included John G. Keane, Daniel G. 
Horvitz, William Eddy, Judith S. Rowe, Benjamin F. King, and Stephen
E. Fienberg. Topics included automation of address files and
questionnaire check-in; dissemination of data on a variety of media 
including microfiche and CD-ROM; cost effectiveness of proposed 
procedures; linking of household records; and use of microcomputers 
for data processing. Two areas were mentioned in which the 
Subcommittee could best assist the Bureau of the Census: the 
procurement of computer equipment and the oversight of plans for 
adjustment of censal counts. The second hearing, held on May 15,
concerned the census questionnaire and automation. Topics included
the design of a shorter census questionnaire form; data conversion 
methods such as Film Optical Sensing Device for Input to Computer 
(FOSDIC) and optical mark recognition (OMR); cost effectiveness; and 
the National Content Test. Speakers included Susan Miskura, Gene
Dodaro, and Gail Franke. (GDC)



***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made      *
*                   from the original document.                       *
***********************************************************************






PLANS AND ACTIVITIES FOR 1990 

DECENNIAL CENSUS 




PART 2 

HEARINGS 

BEFORE THE 

SUBCOMMITTEE ON 
CENSUS AND POPULATION 

OF THE

COMMITTEE ON
POST OFFICE AND CIVIL SERVICE 
HOUSE OF REPRESENTATIVES 

NINETY-NINTH CONGRESS 
SECOND SESSION 






MAY 1 AND 15, 1986 



Serial No. 99-48 



Printed for the use of the Committee on Post Office and Civil Service 



U.S. DEPARTMENT OF EDUCATION 

Office of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

☒ This document has been reproduced as
received from the person or organization
originating it.
□ Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this docu-
ment do not necessarily represent official
OERI position or policy.






61-902 0 



U.S. GOVERNMENT PRINTING OFFICE 
WASHINGTON : 1986



For sale by the Superintendent of Documents, Congressional Sales Office 
U.S. Government Printing Office, Washington, DC 20402 






COMMITTEE ON POST OFFICE AND CIVIL SERVICE
WILLIAM D. FORD, Michigan, Chairman

WILLIAM (BILL) CLAY, Missouri
PATRICIA SCHROEDER, Colorado
STEPHEN J. SOLARZ, New York
ROBERT GARCIA, New York
MICKEY LELAND, Texas
GUS YATRON, Pennsylvania
MARY ROSE OAKAR, Ohio
GERRY SIKORSKI, Minnesota
FRANK McCLOSKEY, Indiana
GARY L. ACKERMAN, New York
MERVYN M. DYMALLY, California
RON de LUGO, Virgin Islands
MORRIS K. UDALL, Arizona

GENE TAYLOR, Missouri
BENJAMIN A. GILMAN, New York
CHARLES PASHAYAN, Jr., California
FRANK HORTON, New York
JOHN T. MYERS, Indiana
DON YOUNG, Alaska
JAMES V. HANSEN, Utah
DAN BURTON, Indiana

Tom DeYulia, Staff Director
Robert E. Lockhart, General Counsel
Patricia F. Rissler, Deputy Staff Director and Chief Clerk
Joseph A. Fisher, Minority Staff Director



Subcommittee on Census and Population
ROBERT GARCIA, New York, Chairman

MARY ROSE OAKAR, Ohio
GARY L. ACKERMAN, New York

JAMES V. HANSEN, Utah
JOHN T. MYERS, Indiana

Lillian Fernandez, Subcommittee Staff Director





CONTENTS 



May 1, 1986

Processing procedures for the 1990 decennial census

Statement of:                                                          Page
    John G. Keane, Director, Bureau of the Census, accompanied by Peter A.
      Bounpane, Assistant Director for Demographic Censuses ............. 2
    Daniel G. Horvitz, executive vice president, Research Triangle
      Institute ........................................................ 36
    William Eddy, department of statistics, Carnegie Mellon University .. 56
    Judith S. Rowe, associate director for research services, Princeton
      University Computing Center ...................................... 69
    Benjamin F. King, director, survey methods, Educational Testing
      Service .......................................................... 86
    Stephen E. Fienberg, Maurice Falk Professor of Statistics and Social
      Science, Carnegie Mellon University .............................. 98

May 15, 1986

Census questionnaire and automation

Statement of:
    Susan Miskura, Chief of Decennial Planning Division, U.S. Bureau of
      the Census ...................................................... 112
    Gene Dodaro, Associate Director, General Government Division, U.S.
      General Accounting Office ....................................... 150
    Gail Franke, vice president, Federal Government marketing, National
      Computer Systems ................................................ 183






PROCESSING PROCEDURES FOR 1990 
DECENNIAL CENSUS 



THURSDAY, MAY 1, 1986 

House of Representatives, 
Subcommittee on Census and Population, 
Committee on Post Office and Civil Service, 

Washington, DC. 

The subcommittee met, pursuant to call, at 10:12 a.m., in room 
311, Cannon House Office Building, Hon. Mary Rose Oakar presid- 
ing. 

Ms. Oakar. Good morning and welcome to the Subcommittee on 
Census and Population hearing on the processing procedures for 
the 1990 decennial census. Unfortunately, Congressman Garcia, the 
chairman of this subcommittee, which I am proud to be a member 
of, must address some urgent matters in his district and regrets he 
is not able to be here. He does apologize, but I have offered to fill
in, although no one really can, for Congressman Garcia and I will 
be chairing at least part of the hearing and hopefully Congressman 
Ackerman will take over. So we are delighted to be here. 

I would like to submit Congressman Garcia's statement for the 
record. 

[The statement of Robert Garcia follows:] 

Opening Statement of Hon. Robert Garcia 

Good morning and welcome to our hearing on processing the 1990 Decennial 
Census. We have decided to focus our hearing today on processing the census be- 
cause it is critically important to the success of the decennial census. 

The 1990 Census will be our nation's bicentennial census, and since 1790 the 
census has assisted our nation in attaining our democratic goals. The success of the 
census is central to the progress of our people and our economy. Because processing 
the census is linked to the success of the census, it is important to make sure that
the Bureau utilizes the best automation technology available. In times of fiscal con- 
straints, it is more than ever important to find ways to run our government pro- 
grams more efficiently, including the decennial census. 

In April of 1985 we held the first hearing during this Congress on the 1990 Decen- 
nial. At that hearing the General Accounting Office raised concerns that the Census 
Bureau is not making timely decisions regarding automating the census. Similar 
concerns were expressed at another hearing by the Inspector General of the Com- 
merce Department. Now that it is a year later, I am interested in finding out the 
progress that has been made in the Census Bureau's decision making on processing 
the census. 

Today, we have invited experts from the academic and private sectors to react to 
the Census Bureau's achievements and plans. These experts have been selected with 
assistance from the Committee on National Statistics of the National Academy of 
Sciences. 

We will first hear from the Census Bureau on the status of their plans and activi- 
ties in four major areas of processing: (1) address list compilation; (2) design of auto-
mated check-in procedure; (3) editing and processing the census forms; (4) and proc-
essing plans for data dissemination. Then we will hear evaluations of the plans from
the experts in the four areas.

Ms. Oakar. The hearing was called to give the subcommittee an 
opportunity to learn the status of the Census Bureau's plans for 
processing the 1990 census. The Bureau provided the subcommittee 
with detailed reports on its plans, consisting largely of preliminary 
and internal documents, and these were referred to a panel of ex- 
perts suggested by the National Academy of Sciences' Committee 
on National Statistics. The experts will also testify at the hearing. 

The chairman believed that the subcommittee needed to be reas- 
sured that the Bureau formulated a clear plan of action for this 
vital part of the census. Critics are concerned that the Bureau is 
not considering a wide enough range of alternatives or settling upon 
a coherent management plan, and staff have been monitoring the 
process at the Census Bureau and keeping members apprised. The
Bureau held planning meetings in October, but it is not clear how 
fully implemented those plans are. Due to the long lead times re- 
quired to set up census procedures, the chairman was concerned 
that the Bureau not lose the momentum that it had. By holding 
the hearing it was hoped that the subcommittee could provide an 
occasion for the Bureau to explain how it will process the census 
by relying on outside, nonpartisan experts for the detailed reviews. 

The subcommittee hopes to focus attention on the need for cur- 
rent action and away from partisan considerations. The cost of 
travel is not being reimbursed by the Government for the experts 
who have agreed to appear at their own expense, and we are very 
grateful for that. We are very delighted to have the Director of the 
Census Bureau here, Dr. John G. Keane. He will be accompanied 
by his Assistant Director, and we are very happy to have you, 
Doctor. Thank you very much for being here and, again, I know 
that my colleagues and Chairman Garcia welcome you and we look 
forward to your testimony. 

STATEMENT OF JOHN G. KEANE, DIRECTOR, BUREAU OF THE 
CENSUS, ACCOMPANIED BY PETER A. BOUNPANE, ASSISTANT 
DIRECTOR FOR DEMOGRAPHIC CENSUSES 

Mr. Keane. Good morning, Madam Chair, and thank you. We ap- 
preciate the opportunity to brief the subcommittee on our plans for 
the 1990 census. My full statement for the record has been submit- 
ted, so my oral comments will take the form of an overview. 

As the subcommittee requested, I will discuss four crucial topics 
in planning the 1990 census. Those four are: one, concurrent proc- 
essing; two, address list compilation; three, automated address con- 
trol file and automated check-in; and, four, data products and their 
dissemination. 

Going to one, concurrent processing, and by that just so we will 
all be clear on it, we mean questionnaire data conversion that 
occurs concurrently with questionnaire collection. We want to 
begin automated data processing of the 1990 census 5 to 7 months 
earlier than for the 1980 census. In 1980 the conversion of the data 
to machine readable form did not begin until all 409 district offices 
closed and shipped all questionnaires to our then three processing 
centers. So in 1980 it was a sequential process as opposed to a con-
current process which we are proposing and planning for 1990. The
advantage of a concurrent process, some of the main ones are 
these: It will allow more time for review and correction of the data; 
it will enable the computer to assist in certain operations; and it 
will give us an earlier start that will help us meet our goal of dis-
seminating data products in a more timely fashion.

The issues involved really boil down to two. Planning for the con-
current processing of the 1990 census has posed these two major
questions: Where it would be done and how it would be done? The 
where issue involves the number of processing offices and the 
degree of centralization or decentralization. The how issue involves 
the technology we will use to convert questionnaire data into com- 
puter readable form. 

In addition to these two major questions, we have to answer nu- 
merous related questions. As to the planning, we have been in- 
volved in extensive review of these issues over the last several 
months. For example, we held a major conference to discuss them 
in October 1985. That was referred to as the Decennial Census De- 
cision Conference, and the subcommittee staff was represented at 
that conference. 

Our discussions at this conference made clear what further infor- 
mation we would need to make a final decision; and since then the 
staff have prepared what we call action plans to analyze the bene- 
fits and the risks of various approaches identified. Senior staff com- 
pleted the review of the first wave of these action plans earlier this 
week. Based on this review we have made some decisions on con- 
current processing issues. At the conference in October we defined 
the basic structure of processing and collection offices. The action 
plans then analyzed this structure and found that several of its fea- 
tures were problematic. For example, the key entry workload 
would have required more offices, equipment, and staff than we be- 
lieve we can manage. Developing two primary data conversion sys- 
tems, that is, the key entry and FOSDIC, would be a strain on re- 
sources. 

The required transfer of information between the processing and 
collection offices, especially for hard to enumerate areas, threat- 
ened delays in the start of the nonresponse followup. So we have 
reached decisions that will solve these problems. We have reduced 
workloads in certain key entry operations such as surname and 
full name keying. No data keying is planned. This will allow us to 
keep the number of processing offices down to a manageable 
number, down to about 12. We have agreed to use FOSDIC as the 
primary data conversion system for the 48 contiguous States. We
have not yet reached decisions for Alaska, Hawaii, Puerto Rico, 
and the outlying areas. 

Finally, we have agreed to configure the operations differently in 
areas that are hard to enumerate as opposed to the rest of the 
country. For the hard to enumerate areas, respondents will mail 
questionnaires back to the processing offices. There they will be 
automatically checked in, converted to computer readable form and
computer edited. Clerks will prepare edits or route them for tele-
phone followup. Questionnaires that need personal visit followup
will be sent to the collection offices. The collection offices will con-
duct nonresponse and failed edit followup. For the rest of the coun-
try respondents will mail responses to the collection offices, and
they will be automatically checked in. There will be a clerical tele- 
phone edit and personal visit followup as necessary. 

As each questionnaire passes edit, the collection office will ship it
on a flow basis to the processing office for data conversion. The
collection office will, of course, be responsible for nonresponse fol-
lowup.

We believe these decisions represent the best balance of staffing,
equipment, and workload considerations between the processing 
and collection offices. We avoid large staffing requirements in the 
processing offices by doing questionnaire check-in and edit for most 
of the country in the collection offices. For hard to enumerate 
areas, we will have the benefit of the automated edit. Also for hard 
to enumerate areas, we will not need large numbers of clerical 
workers. We can, therefore, concentrate on hiring followup enu-
merators. These decisions allow us to meet our goal of concurrent
processing for the entire country. 
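
The office configuration described in this part of the testimony can be summarized as a small routing table. The following is only an illustrative sketch in Python: the function name, the area labels, and the step wording are assumptions made for this example and are not taken from any Census Bureau system.

```python
# Illustrative sketch only: names and step labels are hypothetical,
# paraphrasing the two office configurations described in the testimony.

HARD_TO_ENUMERATE = "hard_to_enumerate"
REST_OF_COUNTRY = "rest_of_country"


def questionnaire_route(area_type):
    """Return the ordered (office, step) pairs a mailed-back questionnaire follows."""
    if area_type == HARD_TO_ENUMERATE:
        # Respondents mail questionnaires back to the processing office.
        return [
            ("processing office", "automated check-in"),
            ("processing office", "data conversion"),
            ("processing office", "computer edit"),
            ("processing office", "clerical handling of edits or telephone followup"),
            ("collection office", "personal visit, nonresponse, and failed edit followup"),
        ]
    if area_type == REST_OF_COUNTRY:
        # Respondents mail questionnaires back to the collection office.
        return [
            ("collection office", "automated check-in"),
            ("collection office", "clerical edit and telephone followup"),
            ("collection office", "personal visit followup as necessary"),
            ("processing office", "data conversion"),
        ]
    raise ValueError(f"unknown area type: {area_type!r}")
```

In this sketch the only difference between the two configurations is where check-in and edit take place. Data conversion happens in a processing office in both cases, which matches the stated goal of concurrent processing for the entire country.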

Now that we have defined this basic office configuration, we will 
continue to work on the detailed plans for implementation over the 
next several months. We will make the final decisions by December 
of this year.

Two, address list compilation. We reported to the subcommittee
on this topic at a March 13 hearing, so I will be very brief now. In
1984 we conducted an address list compilation test. The purpose of
it was to test different ways to compile address lists. Based on the
results of this test, we have determined our methods for compiling
lists for both urban and rural areas. For urban areas we will use
commercial vendor lists as the initial source. The test showed that
the 1980 list could be a viable alternative or supplement to the
vendor lists where the latter do not exist or where they are of sus-
pect quality.

Currently we are developing criteria to evaluate, select, and aug-
ment vendor lists. For rural areas we will, again, have census
enumerators create the initial list from scratch. We call this func-
tion prelisting. For both urban and rural areas, we will conduct
update operations to improve the list. The U.S. Postal Service will
assist us in some of these updates.

In our test census we are refining our procedures. These efforts
are described in some detail in my written testimony. Because of
this extensive testing and fine tuning, we are confident that our
mailing lists for 1990 will be as good or better than those for 1980.

The third area is automated address control file and automated
questionnaire check-in, and these are so interrelated we present 
them as one area. Now I shall discuss how we are going to use an 
automated address file to provide greater control over the census 
process and replace some large-scale clerical operations. In the 
1980 census we did not have an automated address control file. 
Changes to the address registers resulting from district office oper-
ations were entered in pencil. Questionnaire check-in was done
manually and involved many clerks and much time. Clerks also
had to prepare separate address registers for nonresponse followup.

The development of an automated address control file will be one
of the major automation advances since the 1980 census. We can
key in changes to the lists and thus automatically update the file.
We can put bar code labels on the questionnaires and automatical-
ly check them in using laser sorters or wands. We can quickly iden-
tify addresses for which questionnaires have not been returned. 
The computer can print out lists of addresses for use by the follow- 
up enumerators. If we determine that it is cost-effective, we could 
send reminder notices to nonresponding addresses automatically, 
as we tested in Tampa. We are continuing to evaluate the cost-ef- 
fectiveness of reminder notices. 
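
The check-in idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Bureau's design: the class names, the `address_id` field (standing in for whatever a bar code label would encode), and the sample addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AddressRecord:
    address_id: str          # hypothetical key encoded in the bar code label
    address: str
    checked_in: bool = False


class AddressControlFile:
    """Toy automated address control file keyed by bar code identifier."""

    def __init__(self, records):
        self._records = {record.address_id: record for record in records}

    def check_in(self, scanned_id):
        """Mark a returned questionnaire as checked in; False if the ID is unknown."""
        record = self._records.get(scanned_id)
        if record is None:
            return False
        record.checked_in = True
        return True

    def nonresponse_list(self):
        """Addresses with no returned questionnaire, for followup enumerators."""
        return sorted(
            record.address
            for record in self._records.values()
            if not record.checked_in
        )
```

As a usage sketch, building the file from three made-up addresses, scanning one returned questionnaire with `check_in`, and then calling `nonresponse_list` yields the two remaining addresses, which is the list a followup enumerator (or a reminder mailing) would work from.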

Now to the final area, data products. The final topic I shall dis- 
cuss is data products. It is fitting that this is our closing topic in a 
way. Making data available to policy makers and the public truly 
is the keystone of a successful census. We have been consulting 
with a broad array of data users in formulating our plans for the 
1990 census data products: the National Conference of State Legis- 
latures; special Census Bureau planning conferences; local public 
meetings, 65, at least one in a State; State data centers and State 
government agencies; regional meetings with American Indians; 
community meetings with Hispanic, black, Asian and Pacific Is- 
lander groups; other Federal agencies and, of course, this subcom- 
mittee staff. 

During April and May of this year we are conducting a series of 
10 product planning meetings around the country. In the fall of the 
year we will hold a conference to present our final report on the 
results of those meetings. At this time we are proposing that com- 
puter summary tapes, printed reports and microfiche will be the 
primary means for distributing 1990 census data, as they were for 
1980. 

We are also evaluating other dissemination media for making 
data available. These media include flexible diskettes for microcom- 
puters, laser disks and on-line data systems. We are considering a 
number of changes in the format, the length and the sequencing of 
1990 data products. These are intended to help us meet our goal of 
providing quality products with a minimum of delay. 

We are discussing these proposals with data users at our product 
planning meetings. Our schedule calls for preparing the final pro- 
gram descriptions for both 100 percent and sample products by the 
fall of 1986. We plan to prepare the detailed specifications for most 
of these products during 1987. 

In conclusion, each of the four issues I have discussed today raises
many related complex issues, but we have made much progress in
resolving outstanding issues and in making important changes 
since 1980. We are building a consensus plan, I might add a coher- 
ent plan in terms of our opening reference, that will lead to success 
in our collection procedures, in automated census processes and in 
the timely dissemination of useful data products. 

Now I and my colleagues look forward to comments and sugges- 
tions from other witnesses assembled today. 

[The statement of John G. Keane and his response to written 
questions follow:] 






STATEMENT OF THE DIRECTOR OF THE BUREAU OF THE CENSUS

John G. Keane

Before the Subcommittee on Census and Population
Post Office and Civil Service Committee
U.S. House of Representatives
May 1, 1986



Introduction 



Mr. Chairman, thank you for this opportunity to brief the Subcommittee 
on plans for the 1990 census. As you requested, I will discuss four
topics that are crucial areas in planning the 1990 census. The topics
are: (1) address list compilation, (2) automated address control file
and automated check-in, (3) concurrent processing, and (4) data products
and their dissemination. 

We are at a very important point in planning the 1990 census. We 
must make a number of key decisions in the next few months that will 
largely determine how the 1990 census will be taken and processed. We 
welcome this opportunity to discuss our plans with you and look forward 
to the comments and advice of Subcommittee members and the expert witnesses 
who will be testifying today. 



Address List Compilation

Mr. Chairman, the first topic I will talk about is address list
compilation, which we discussed at the March 13 hearing before this
Subcommittee.

Address lists that are as complete and accurate as possible are
essential if we are going to conduct a good census using the mail-out/
mail-back method. We use address lists to control the enumeration by
mailing questionnaires to each housing unit on the lists and monitoring
the mail returns to determine whether a questionnaire has been returned
for a particular unit. Once a housing unit is included in our address
lists, we stand an excellent chance of completing the enumeration of
that unit and its inhabitants.

Address lists are not our only concern in taking a good census. 
We must also devise procedures to assure complete within-household
coverage, to handle the enumeration of group quarters population, and
to count those persons who have no usual living quarters. But, by
compiling good address lists we go a long way toward having a successful
census.

Since the mid-1960's, we have conducted several test censuses in
which we examined address list compilation procedures, and we have
compiled addresses for the last two decennial censuses. This experience
has shown us that no single source of addresses is complete enough to
meet our stringent requirements. Also, address lists become out-dated 
quickly. Therefore, we perform several updates to verify the accuracy 
and completeness of the census mailing lists. 

In 1980, for urban mail-census areas, we first purchased address 
lists from commercial vendors. These lists were relatively inexpensive 
($780,000 for 48.6 million addresses). They generally provided good
coverage, but to improve them, we subjected the lists to four update
operations before using them to mail questionnaires. The U.S. Postal
Service (USPS) conducted three of these checks and our own enumerators
conducted one.

We called the first USPS check the "advance post office check"
because it was conducted about 10 months prior to Census Day (May-June
1979). The second check, conducted in early March 1980, was called
the "casing" check, because mail carriers sorted census questionnaires
into their cases but did not deliver them. We called the third USPS
check the "time of delivery" check because it was conducted on
March 28, 1980, when mail carriers delivered the census questionnaires.
During each of these three checks, the USPS reported addresses missing
from our lists to us, deleted addresses that were nonexistent or were
businesses, and made corrections to addresses.

We called the update operation conducted by our own personnel
"precanvass." In February 1980, our enumerators systematically canvassed
assigned areas, updated the address lists, and verified the geographic
locations of addresses.

For those mail-census areas where commercial lists were not
available (generally, the more rural areas of the country), we hired
enumerators to list and identify the geographic locations of addresses
from scratch. We called this operation, which took place from March
to October 1979, "prelist." We also subjected prelist addresses to
the casing and time-of-delivery updates by the USPS. In addition, for
selected areas, we added a canvassing operation after Census Day to
identify any missed units.

As a result of all these overlapping checks, we believe the
address lists were as complete as reasonably possible by the time we
mailed the questionnaires. We continued to check on the completeness
of our housing inventory in later census operations.







Given the importance of address lists in taking the census, it is 
appropriate that our first special test for the 1990 census was the 1984 
Address List Compilation Test (ALCT). Even though we believe that the
methods used to prepare the 1980 mailing list were successful, we wanted
to test alternative address list sources to try to improve the accuracy
and lower the costs of our methodology. A 1982 report by the General 
Accounting Office had suggested that we investigate the use of a mailing 
list developed by the USPS. 

We designed the 1984 ALCT to evaluate several ways to develop address 
lists for both urban and rural mail areas. The overall evaluation compared 
the relative cost and the relative accuracy of various address list 
compilation sources as augmented by various update methods. 

In the urban test sites (Bridgeport and Hartford, Connecticut) we 
compared three initial list sources: (1) the USPS, (2) a commercial 
vendor, and (3) the final list of addresses from the 1980 census. We 
updated each of these lists using our "precanvass" procedure and a USPS 
casing check. 

Now, I will discuss the results of the ALCT for the urban areas. 
From the standpoint of coverage, the ALCT results do not rule out any of 
the list compilation techniques. However, the 1980 and USPS lists were 
more expensive than the vendor lists. Also, changing to a USPS-created 
list on a national scale would pose significant planning, control, and 
operational risks. Finally, the relative success of the 1980 list approach 
must be tempered by the fact that there had not been much change in the 
test areas since 1980. It is not reasonable to assume this approach
would do as well in 1990 for high growth areas. 





Based on these findings, we have decided to use vendor lists as the
primary address list source in urban areas in 1990. The test also showed
that the 1980 lists are an acceptable alternative or supplement to vendor
lists in areas where no vendor has a list or the vendor list is of suspect
quality. Thus, we will consider selective use of the 1980 lists. We are
developing criteria for evaluating, selecting, and augmenting the vendor
lists we receive in response to our request for proposals.

For the rural sites in the ALCT (Hardin County, Texas and Gordon and 
Murray Counties, Georgia) we tested two initial list sources: (1) the
USPS, and (2) a Census Bureau prelist operation. We also used the same
two update techniques as in the urban areas: a USPS casing check and a
precanvass by census enumerators.

Again, from the standpoint of coverage, the ALCT results do not rule
out the USPS list. There also do not appear to be any significant differ-
ences in cost between the two methodologies. However, the USPS had
problems assigning correct geographic codes to addresses and marking
housing unit locations on census maps (which is essential for field
followup purposes). Furthermore, there are the same risks mentioned
above for urban areas if we were to convert to the use of USPS-developed
lists in rural areas on a national scale.

Based on these factors, we have decided to use the prelist methodology 
to create the initial address list for rural areas in 1990. 

Despite the decision not to use the USPS as a source of our initial 
address lists for the 1990 census, the USPS still will play a crucial 
role in the 1990 census. The USPS will conduct the three coverage checks
of our address lists, deliver and collect the questionnaires, and, as I
will discuss below, sort the returned questionnaires for us. 

As a result of the 1984 ALCT, we have determined our approaches for 
acquiring initial lists for the 1990 census. I will close this discussion
of address list compilation by reviewing briefly some of the additional 
tests we are conducting that relate to address list methodology. 

In the 1986 test census in East Central Mississippi, we are refining 
our prelist procedures. We have incorporated an additional USPS check of 
the prelist addresses--an advance post office check. In 1980, this postal 
check was only done for the vendor lists in urban areas. We also are 
looking at rural delivery methodology. Following the prelist and advance 
post office check, the Mississippi test site was split into two panels. 
In one panel, census enumerators conducted a precanvass operation followed 
by USPS delivery of questionnaires. In the other panel, census enumerators 
delivered questionnaires and updated the address lists at the same time. 
This new update/leave operation is being tested because in the past there 
have been difficulties with our ability to obtain adequate mailing 
addresses for certain rural areas of the country. We will compare the 
coverage and cost between these two methods, the operational feasibility 
of each, and problems in coordinating two types of delivery. 

In our 1985 test censuses in Jersey City, New Jersey and Tampa, 
Florida, and in our 1986 test census in Central Los Angeles County, we
tested and evaluated refinements to the precanvass procedures. First, we 
scheduled precanvass about 2 months earlier than in 1980. Scheduling 
precanvass earlier enables us to incorporate changes so as to have a more 
accurate and complete housing unit inventory for review during the USPS
casing and time-of-delivery checks. Second, we had the enumerators do a
unit-by-unit canvass in each multiunit building. In 1980, the enumerator
only verified the number of units in multiunit structures from the manager 
and did not actually canvass within the building unless the reported 
number exceeded our counts. Preliminary results from the 1985 test in 
Jersey City show that 9 percent of the apartment designations were updated 
and improved as a result of this new procedure. This should help to 
improve the accuracy of delivery and followup operations, particularly in 
those hard-to-enumerate areas where there are many multiunit structures. 

This is just a sampling of the 1985 and 1986 tests related to our 
mailing lists. We will continue to refine our procedures in the 1987 test 
census. To sum up: At this point, we have determined the sources for 
initial lists (one of our major milestones in 1990 census planning) and
are determining the most cost-effective set of update operations to those
lists. We expect our mailing lists for 1990 to be as good as, or better
than, those for 1980.



Automated Address Control File and Automated Questionnaire Check-In

I will now discuss how we are going to use an automated address file
to provide greater control over the census process and to replace some
large-scale clerical operations (such as questionnaire check-in) with
automated processes.

The development of an automated address control file is one of the
major automation advances since the 1980 census. A description of the
labor-intensive 1980 census operations will be helpful in understanding
how an automated address control file will improve census procedures in 
1990. 

In 1980, although the initial file of addresses was computerized,
the district offices received only paper lists of the addresses--one or
more address "registers" containing the address for each housing unit in
the enumeration district (ED). Clerks made manual changes to the address
registers as a result of operations such as precanvass and the USPS
casing and time-of-delivery checks. The changes included adding, deleting,
and correcting addresses, as well as moving them from one ED or census 
block to another. These update operations were labor-intensive and, as 
with any large-scale clerical operation, subject to error. 

We sorted and checked in returned questionnaires manually.
When householders returned their questionnaires by mail in 1980 (about 
70 million did so), our clerks in the district offices first sorted the 
forms by hand to the ED level and then placed them in serial number order 
within ED. Then the clerks matched the questionnaires, one at a time, to 
the address register for the appropriate ED. When they found the serial 
number and corresponding address in the book, they recorded the check-in 
date and other pertinent data. This operation required dozens of clerks 
in each of the 409 district offices. Many district offices took longer 
than the 2 weeks allotted, delaying the start of followup operations. 

For followup of nonrespondents in 1980, we gave the address registers 
to the enumerators and kept a copy in the district office. (The addresses 
were printed on two-part paper and two books were created by separating 
the pages and reassembling them.) The address books contained all the
original computer-printed addresses and all hand-entered changes, and
contained every address--both for those households that responded to the
census and those for which we received no response.

With an automated address file, we can key in changes and, thus, 
automatically update the file. We can use bar-code technology for computer 
check-in of the questionnaires. As a result, it will be easier to identify 
quickly the addresses for which questionnaires have not been returned. 
If we determine it is cost-effective, we could send reminder notices to 
those addresses automatically, possibly reducing further the number of 
nonresponding housing units to which we need to send enumerators. It 
also should be noted that concurrent processing of census data, which I 
will discuss next, would not be feasible without an automated address 
control file and automated questionnaire check-in.
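
The keyed-in updates described above (adding, deleting, and correcting addresses, and moving them between EDs or blocks) can be illustrated with a minimal sketch. The record layout, the `apply_change` helper, and the change format are hypothetical, chosen only for illustration; the testimony does not describe the actual file design.

```python
from dataclasses import dataclass


@dataclass
class AddressRecord:
    address: str
    ed: str      # enumeration district (hypothetical field name)
    block: str   # census block (hypothetical field name)


def apply_change(control_file, change):
    """Apply one keyed-in change to a control file held as a dict keyed by serial number."""
    action, serial = change["action"], change["serial"]
    if action == "add":
        if serial in control_file:
            raise KeyError(f"serial {serial} already on file")
        control_file[serial] = AddressRecord(change["address"], change["ed"], change["block"])
    elif action == "delete":
        del control_file[serial]
    elif action == "correct":
        control_file[serial].address = change["address"]
    elif action == "move":
        record = control_file[serial]
        record.ed, record.block = change["ed"], change["block"]
    else:
        raise ValueError(f"unknown action: {action}")


# Illustrative data only; these addresses and codes are invented.
control_file = {"0001": AddressRecord("10 Elm St", "ED-12", "101")}
for change in [
    {"action": "add", "serial": "0002", "address": "12 Elm St", "ed": "ED-12", "block": "101"},
    {"action": "correct", "serial": "0001", "address": "10 Elm Street"},
    {"action": "move", "serial": "0002", "ed": "ED-13", "block": "205"},
]:
    apply_change(control_file, change)
```

Because every change passes through one routine, the file reflects each update as soon as it is keyed, in contrast to the hand-annotated 1980 address registers.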

On the other hand, there are risks involved in automating the address 
control file. Most importantly, having an automated address control file 
to support field operations will increase our requirements for computer 
hardware and skilled personnel to operate that hardware in the processing 
centers at the peak of operations. 

In our 1985 test censuses in Jersey City, New Jersey, and Tampa, 
Florida, we successfully implemented an automated address control file, 
automated check-in, and the use of reminder cards. 

We are building on our 1985 experiences in the test censuses this 
year. Census Day was March 16 for our 1986 test censuses in Central Los 
Angeles County, California, and East Central Mississippi. In Los Angeles 
County, we have a collection office in Bell but we are processing the 
census data at a separate site in Laguna Niguel. Householders mailed
their questionnaires directly to the processing office in Laguna Niguel.
The USPS sorted the questionnaires for us by form type (short or long).
An additional sort was performed in the processing center for single- and
multiunit addresses. (The questionnaires for multiunit addresses went to
a keying operation where the householders' surnames were keyed in. This 
is necessary for nonresponse followup.) 

We had imprinted each questionnaire with a bar code that contains a
unique identification number. We ran the returned questionnaires through 
a multiple-pocket laser sorter that performed the necessary sorts not 
already done by the USPS and simultaneously read each questionnaire's bar 
code, recording the identification number onto a computer tape attached
to the sorting equipment. We used this tape to update the address control 
file, and the addresses for which questionnaires were returned were noted 
on the address control file along with the date of check-in. For various 
reasons, the laser sorter could not read all bar codes. In these cases, 
clerks used hand-held wand readers to record the data, and if the wands 
also were unable to pick up the codes, we relied on keying as a backup.
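
The fallback order just described (laser sorter, then hand-held wand, then keying) can be sketched as follows. The three reader functions are hypothetical stand-ins that simulate read failures from a label prefix; they do not represent the actual equipment interfaces, and the date is only an example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CheckIn:
    questionnaire_id: str
    method: str
    checked_in_on: date


# Hypothetical readers: each returns the identification number, or None on a failed read.
def laser_sorter(label: str) -> Optional[str]:
    return None if label.startswith(("smudged:", "torn:")) else label


def wand(label: str) -> Optional[str]:
    return None if label.startswith("torn:") else label.split(":")[-1]


def keying(label: str) -> Optional[str]:
    return label.split(":")[-1]   # a clerk keys the printed number by hand


READERS = [("laser sorter", laser_sorter), ("wand", wand), ("keying", keying)]


def check_in(label: str, on: date) -> CheckIn:
    """Try each capture method in order and record the first one that succeeds."""
    for method, read in READERS:
        questionnaire_id = read(label)
        if questionnaire_id is not None:
            return CheckIn(questionnaire_id, method, on)
    raise ValueError("no method could capture the identification number")


# Record each check-in on a control file keyed by identification number.
control_file = {}
for label in ["000123", "smudged:000124", "torn:000125"]:
    entry = check_in(label, date(1986, 3, 20))
    control_file[entry.questionnaire_id] = entry
```

Recording the capture method alongside the check-in date would also show how often the slower fallbacks were needed, which bears on the equipment choice discussed later in the testimony.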

Using this automated system, we could determine from the address 
control file whether a questionnaire had been checked in for a particular 
address. We generated lists of nonresponse addresses and mailed reminder 
cards to them. On April 3, 18 days after Census Day, we generated the 
lists of nonresponse addresses to be followed up by enumerators. The
followup lists contained only addresses for which questionnaires were not
received. For multiunit addresses, all addresses were listed (along with
the names of the householders and other information for responding units) 
so the enumerators could resolve possible apartment mixups. Units for
which questionnaires had been returned are not being visited, except when
necessary to clear up apartment mixups. Apartment mixups occur in buildings
without clear unit designations or when the mail carrier gives the
questionnaire designated for Apt. 1 to Apt. 2, etc. Finding ways to 
solve apartment mixups is an important part of our plans to improve the 
census in hard-to-enumerate areas. 
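
A minimal sketch of how such a followup list might be drawn from the address control file appears below. The `Unit` record and `followup_list` function are hypothetical. The sketch also assumes that a multiunit building is listed only when at least one of its units has not responded; the testimony does not say whether fully responding buildings were listed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Unit:
    questionnaire_id: str
    building: str              # street address of the structure
    apartment: Optional[str]   # None for a single-unit address


def followup_list(units, checked_in):
    """Return (unit, needs_visit) rows for the enumerators.

    Single-unit addresses appear only if no questionnaire was checked in.
    For a multiunit building with any outstanding unit, every unit is listed
    so that apartment mixups can be resolved, but only the outstanding units
    are flagged for a visit.
    """
    by_building = defaultdict(list)
    for unit in units:
        by_building[unit.building].append(unit)

    rows = []
    for members in by_building.values():
        outstanding = [u for u in members if u.questionnaire_id not in checked_in]
        if not outstanding:
            continue
        listed = members if len(members) > 1 else outstanding
        rows.extend((u, u.questionnaire_id not in checked_in) for u in listed)
    return rows


# Illustrative data only; these addresses are invented.
units = [
    Unit("1", "10 Elm St", None),
    Unit("2", "12 Elm St", None),
    Unit("3", "20 Oak Ave", "Apt. 1"),
    Unit("4", "20 Oak Ave", "Apt. 2"),
]
rows = followup_list(units, checked_in={"1", "3"})
```

In this example the responding single-unit address is dropped, while the responding apartment stays on the list, unflagged, next to its nonresponding neighbor.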

In Mississippi, where we combined the collection and processing 
offices, the principle of automated check-in was much the same as in the 
urban test site (i.e., use of bar-code technology), but there were some
differences. The USPS did not perform any questionnaire sorts for us, 
nor were laser sorters used. All sorts were done manually by processing 
office staff. We used pen-shaped wands connected to microcomputers to
check in the questionnaires. Clerks moved a wand over the bar codes to
read the identification numbers. They keyed in the numbers if they could
not be read by wand.

One of the major decisions we must make with regard to check-in is 
which technology to use in which type of processing office. This issue 
is intimately tied to the larger issue of office configuration (number 
and type of processing offices). It is estimated that one laser sorter 
can process about 11 times as many questionnaires as one wand station 
(10,000 per hour vs. 900) but the laser sorter could cost about 50 times 
more ($250,000 vs. $5,000). The laser sorters would require more 
maintenance per machine and more skilled personnel. On the other hand, 
using wands more extensively means more wand stations and production 
clerks would be required. We also must consider potential use of the
equipment after the 1990 census. (Microcomputers used in wanding would
have more use after 1990 than laser sorters.)
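
The equipment comparison above can be worked through with the figures given in the testimony. The sketch below covers purchase cost and hourly capacity only. It leaves out maintenance, staffing, and post-1990 reuse, which the testimony also identifies as factors, so it is an illustration of the arithmetic and not a recommendation.

```python
import math

# Figures quoted in the testimony.
SORTER_RATE, WAND_RATE = 10_000, 900      # questionnaires per hour
SORTER_COST, WAND_COST = 250_000, 5_000   # dollars per unit


def cost_per_hourly_capacity(cost, rate):
    """Purchase dollars per questionnaire-per-hour of capacity."""
    return cost / rate


def wand_stations_to_match(sorters=1):
    """Whole wand stations needed to equal the hourly capacity of the given sorters."""
    return math.ceil(sorters * SORTER_RATE / WAND_RATE)


throughput_ratio = SORTER_RATE / WAND_RATE                            # about 11.1
cost_ratio = SORTER_COST / WAND_COST                                  # 50.0
sorter_unit_cost = cost_per_hourly_capacity(SORTER_COST, SORTER_RATE) # 25.0
wand_unit_cost = cost_per_hourly_capacity(WAND_COST, WAND_RATE)       # about 5.56
stations = wand_stations_to_match(1)                                  # 12
stations_cost = stations * WAND_COST                                  # 60,000
```

On purchase price alone, twelve wand stations matching one sorter would cost $60,000 against $250,000, but each station needs a production clerk, which is the staffing trade-off noted above.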

Another issue we have examined is the extent of USPS involvement in
questionnaire sorting. We already have reached an agreement with the USPS 
under which they will sort questionnaires for us by district office, form 
type, and address type (single- or multiunit).

Final decisions on the issues related to the automated address control 
file and automated check-in are scheduled to be made by September 1986. 
In the meantime, working groups are preparing analyses that examine the 
various options, and we must evaluate our experiences in the test censuses 
in Los Angeles County and Mississippi, where questionnaire check-in 
recently occurred.



Concurrent Processing

Now, I will turn to our third topic, concurrent processing. The
essence of concurrent processing is that we want to begin the conversion
of questionnaire data to machine-readable form concurrently with the
questionnaire collection operation.

In 1980, while we conducted some basic questionnaire processing in
our temporary district offices, the conversion of data to machine-readable
form did not begin until after the district offices completed all
enumeration, edits, and followups and shipped all questionnaires to one of three
automated processing centers. This was a sequential process. An earlier
start in 1990 (5-7 months ahead of the 1980 schedule) will allow more
time for review and correction and will enable the computer to assist in
certain census operations. It also will contribute to the early
identification of enumeration problems. Also, by converting questionnaire
data to machine-readable form sooner, we can minimize the potential for
losing data when original questionnaires are accidentally damaged or 
destroyed. Finally, and perhaps most importantly, it will help us meet 
our goal of disseminating data products in a more timely fashion. 

One of the operations which could be done by computer is editing 
of the questionnaires. In 1980, certain basic edits for completeness 
and coverage were done clerically in the field offices; then, later, 
the computer completed other edits (such as for consistency of the 
data). We have been examining whether we can automate the edits done 
clerically in 1980. Computer edits would be more accurate than clerical 
edits. Staffing might be reduced somewhat, although we would still 
need clerks to control and review the questionnaires that fail edit to 
see if they can be clerically "repaired." We also need clerks to make 
telephone calls or personal visits to those respondents whose question- 
naires cannot be repaired. While automated editing has some important 
advantages, it has drawbacks as well. Perhaps the most critical are the
large number of review and telephone clerks that would be needed in the 
processing centers, and the need to convert the data a second time to 
computer-readable form and assure that it replaces the first data. We 
are weighing these pros and cons to see if we want to do automated 
editing as part of concurrent processing. 

Planning for concurrent processing in the 1990 census has posed 
two major questions: Where and how would it be done? The "where" 
issue involves the number of processing offices and the degree of 
centralization or decentralization. In 1980, when we processed the
census questionnaires sequentially, we had three processing centers. With
concurrent processing, having so few centers probably would not be feasible,
primarily because of the need to move materials quickly between processing 
and collection offices. Greater centralization of processing activities 
also places greater staffing burdens on us, i.e., the need to hire more 
employees in one employment area. We must weigh these concerns against 
problems related to decentralization, such as the need for more hardware 
to service a greater number of locations and the difficulties of 
controlling and supporting many processing offices. 

The "how" issue involves the technology we will use to convert 
questionnaire data into a computer-readable format. In the 1980 census,
we used the FOSDIC technology to do this. FOSDIC stands for Film Optical 
Sensing Device for Input to Computer. The complete data-conversion system 
consists of high-speed cameras that film the questionnaires, film 
developers to process the raw film into rolls of microfilm, and the FOSDIC 
machines that read the data from microfilm onto computer tape. We call the 
system FACT, which stands for FOSDIC and Automated Camera Technology. This 
system worked very well in the 1980 census, and also was used in the 1960 
and 1970 censuses. We are currently evaluating the use of the FACT system
for the 1990 census, as well as considering an alternative primary data
conversion technology--data keying. Even if keying is not used as a
primary data conversion technology, we will use it for entering some of the 
handwritten data on the questionnaires into computer-readable form. 

In addition to the two major questions of where and how to do 
concurrent processing, we have to answer numerous other related questions. 
We need to make our major decisions on processing methodology by September
1986 and begin procuring required equipment by early 1987 (some procurements
will begin sooner).

As a start to answering all of these questions we conducted the 
Decennial Census Decision Conference in October 1985. 

At this conference, we decided that we should implement automated 
processing earlier for the 1990 census than for 1980--so that it occurs
concurrently with field operations. We also agreed on a general office
configuration--not as a final decision, but as a framework to help us focus
our future planning efforts. Under the proposed configuration, we would 
locate large "host" processing offices in metropolitan areas to serve 
several district offices. In more rural areas of the country, we would 
combine the district and processing offices. We would conduct both 
automated processing activities and field follow-up from the same office. 

We also discussed extensively the issue of data conversion technology 
since it is intertwined with the issue of office configuration, but a 
decision on this issue was not a goal of the conference. 

As we discussed these issues at the conference, we found that we did 
not have all the information we needed to reach final decisions on the 
issues related to concurrent processing. After the Decennial Census
Decision Conference, we established working groups to analyze these 
issues and to prepare "action plans". Each action plan is designed to 
examine and answer specific questions about the data processing systems 
necessary to support the 1990 census and to analyze the benefits and 
risks of various approaches in terms of office structure, timing, staffing, 
costs, management and coordination, quality, technical support, etc. 






The action plans are being completed in three waves. We have provided
the Subcommittee with a complete list of action plans and the three key
plans in the first wave. I want to emphasize that these reports represent
internal Census Bureau staff analysis and recommendations; they have not been
thoroughly reviewed by senior Census Bureau management nor the Department
of Commerce before release to the Subcommittee and do not represent
official Census Bureau positions.

The set of issues we are addressing to make concurrent processing
a reality are very complex. The action plans, including the other
[illegible] plans and later plans, are interrelated such that a change in
one plan can dramatically affect other plans. (Third wave plans are
perhaps less interrelated.)

We are still in the planning process on the issue of concurrent
processing, but we have made progress in narrowing the options during the
last [illegible] months. We have set up a process to analyze thoroughly the many
complex and interrelated issues. I am confident that we will
come to a decision that represents the best approach for the 1990 census
on office configuration, the distribution of equipment, and the other issues
related to concurrent processing.

Data Products 

The fourth and final topic I will discuss today is data products.
Disseminating data products is the keystone to a successful census. All
that comes before is geared toward producing accurate data in a timely
manner. Each of the previous topics I have discussed today relates to
these two goals: accuracy and timeliness. 

I will begin by describing briefly our process for planning the 1990 
data products, which has been underway for some time. In 1982, we began
an internal appraisal of the 1980 census data dissemination program. In
1982-1983, the National Conference of State Legislatures (NCSL) surveyed
state legislative officials regarding the 1980 census P. L. 94-171
(redistricting) computer tapes, microfiche, and paper printouts. In
October 1983, the Census Bureau held a national conference of state 
officials and minority group members to discuss the P. L. 94-171 program 
and the NCSL survey results. 

In April 1984, we held a National Geographic Areas Conference to 
examine our approach to the definition, delineation, and recognition of
geographic areas in the 1980 census and how these activities should change
for the 1990 census. In October and November of 1984 we held three
Regional Geographic Areas Conferences to share the results of the National
Conference with a wider audience and to gather additional information on
these fundamental units for data tabulation.

In the period 1984-1985, we held 65 Local Public Meetings (at least
one in each state); these meetings produced suggestions for planning 1990 
data dissemination media. Also in this period, the State Data Centers, 
state government agencies, and regional and local planning organizations 
provided comments on their use of all 1980 data products and their
recommendations for 1990. We held regional meetings with American Indians 
and Alaska Natives to obtain their suggestions and held community meetings
with Hispanic, Black, and Asian and Pacific Islander groups to collect
similar recommendations. We have also discussed our plans with professional 
organizations and census advisory committees. In the near future, we 
will be consulting with other Federal agencies to learn their needs for 
data products.

During April and May of this year we are conducting a series of 
product planning meetings around the country. In the fall of 1986 we will 
hold a conference to present our final report on the product meetings. 
Based on the thousands of recommendations we received in earlier meetings, 
we have developed proposals for 100-percent reports (those based on data we 
receive from all respondents), user tapes, and microfiche. We are discuss-
ing these proposals at the product planning meetings. Many of these 
proposals also relate to the sample products (those based on data we receive 
from a sample of respondents) and we will also be discussing these at the 
meetings. We will discuss sample products more fully at the fall meeting. 

At this time, we propose that computer summary tapes, printed reports, 
and microfiche will be the primary means for distributing 1990 census data, 
as they were for 1980. Printed reports are an essential medium of data 
dissemination. While access to computers continues to grow, the Census 
Bureau is committed to making data available to all segments of the 
population. The printed reports, available through libraries, make this 
possible.

Microfiche is a compact and relatively inexpensive way of making a 
large amount of data available in "eye-readable" form. Thus, we now plan 
to make selected summary tape files and block statistics available on 
microfiche. In 1970 and earlier censuses, block statistics were issued
in printed reports. In 1980, however, the number of blocks covered in
the series increased to 2.5 million and printed reports would have been 
too costly to produce and too cumbersome to store. So we issued the 
statistics on microfiche. For 1990, we may have as many as 12 to 15 
million blocks. Printed reports for these would be even more costly and 
cumbersome than in 1980, so we will again plan to use microfiche. We are 
looking at ways to make the microfiche easier to use. One option is to 
use larger type, but the trade-off is that more fiche would be required 
to hold the data. The number of fiche also will increase due to the 
greater number of blocks. 

We are also evaluating other dissemination media for making data 
available. These media include flexible diskettes for microcomputers, 
laser disks, and on-line data systems. 

Since 1984, the Census Bureau has sold selected data products on 
flexible diskettes. Given the large quantities of data on the 1980 
census summary tape files and the limited capacity of diskettes, we have 
not considered preparing diskettes containing such data. For example, 
the 1980 census Summary Tape File 3 data for Montana are contained on one 
reel of magnetic computer tape but would require over 100 diskettes. We 
do plan to explore the possibility of downloading small subsets of summary 
tape files onto diskettes using test data tapes from the 1988 dress 
rehearsal census. If users react positively to such diskettes, we may 
offer similar products containing 1990 census data. In addition, we are 
considering the feasibility of producing smaller sets of data on diskettes 
on request. 






Laser disks, similar to the compact disks being sold for home audio 
systems, offer considerable potential for data dissemination. Although 
small in size, they provide tremendous storage capacity. One laser disk 
will hold the contents of about four high-density computer tapes. We are 
currently evaluating laser disks to see if they are a viable option for 
our 1987 agriculture and economic censuses and the 1990 decennial census.

We initiated an on-line database service in 1984. Called CENDATA, 
it is available to the public through private sector database firms. At 
present, CENDATA is relatively small, containing mostly current summary 
data from ongoing surveys and product information. For 1990, we expect 
to expand the capabilities of CENDATA to provide graphics, an interactive 
mode for inquiry and order handling, and substantial amounts of additional 
timely summary data. Even with expansion, however, only a small part of 
1990 census data would be available through this system. 

Now, having described our thoughts on dissemination media, I will 
discuss a few of the other product issues we are addressing in our product 
planning meetings.

We are proposing a number of changes in the format and sequencing of 
1990 data products to help us meet our goal of providing quality products 
with a minimum of delay. For example, we are proposing that the earliest 
products from the census exclude those statistical areas that will be 
defined on the basis of 1990 census results. These nongovernmental 
entities are metropolitan statistical areas and urbanized areas. These 
areas will appear in later products for 1990. 







For the same reasons, we are also proposing that we limit the amount 
of historic data in the initial release and concentrate most historical 
data in a special series to be issued later. 

Some users have expressed a preference for more reports that combine
population and housing data. We propose this for all the 100-percent data 
products. Because products based on sample data are larger than the 
100-percent products, it may not be possible to follow the combined approach. 

Many 1980 census printed reports were much larger than their 1970 
counterparts, at least partially because of the decision to publish much 
more data for race and Spanish origin. We published about as much detail 
for the major race groups (that is, White; Black; American Indian, Eskimo, 
and Aleut; and Asian and Pacific Islander) and for the Spanish-origin 
population as we did for the total population. In planning meetings we 
have held so far, such as the Local Public Meetings, some data users have 
told us that this level of detail is not necessary in the printed reports. 
In our current product planning meetings, we are asking data users whether 
the amount of data needed by race and Spanish origin could be reduced in 
the 1990 reports. Regardless of the amount of detail shown in the printed 
reports, we still plan to tabulate as much data for race and Spanish-origin 
groups as in 1980, and these data would be available in the summary tapes. 

Finally, I will mention that we are considering eliminating both the 
computer summary files and the printed reports that show the most detailed 
cross-tabulations of population and housing data. Again, we are consider- 
ing this based on user recommendations. These data would be available as 
special tabulations on a reimbursable basis. We also plan to develop
public-use microdata samples that would allow users to do their own
detailed cross-tabulations. 

This is just a sampling of the many issues we must address in planning 
our 1990 census data products and we look forward to working with the 
Subcommittee further on these and other issues. Our tentative schedule 
calls for preparing the final program descriptions for both 100-percent 
and sample products by late 1986 and the detailed specifications for 
100-percent and sample products during 1987. 



Conclusion

Mr. Chairman, this concludes my testimony. In each of the four
areas I have discussed today, we have faced or are still grappling with
many complex issues. But we have made much progress in resolving
outstanding issues and in making important changes since 1980. We are
building a census plan that we believe will lead to success in our
collection procedures, in automating census processes, and in disseminating
useful data products in a timely manner. Now, I look forward to hearing
the comments and suggestions of the other witnesses assembled here today.



Responses to Questions from 
Subcommittee on Census and Population 
to 

Dr. John G. Keane 

Director 
Bureau of the Census 
on 

Processing Plans for the 1990 Decennial 
May 1, 1986 



QUESTION 1. In your testimony, you indicated that the Census Bureau was
considering cutting back on the tabulations of the 1990 Census. In
particular you said that you would produce fewer tabulations for Blacks
and Hispanics. You know that I have long believed that the census is a
very important source of information about the progress of the people and
especially those people who need the most help. Following are a few
questions about your decision:

Exactly what tabulations are you thinking of taking out of the 
publications? 

How do you expect to serve the needs of the people who need those 
tabulations? 



ANSWER: We have not made any decision on possible reductions but are
exploring various approaches based on comments received at the local
public meetings held in 1984 and 1985. Some data users at these meetings
suggested that we reduce the amount of subjects or the number of smaller
geographical areas shown for Spanish origin and race groups.

Since one of our primary concerns is to meet the major data needs for
Spanish origin and race, we are seeking additional advice on this matter.
We are raising this issue at a series of ten meetings on 1990 census data
products being held in selected cities across the country. We plan to
consult with the 1990 Advisory Committees on the Hispanic, Black, Asian
and Pacific Islander, and American Indian and Alaska Native populations,
and with ethnic leaders and community-based organizations. We also will
keep this Subcommittee informed of developments on this issue and welcome
your comments.

If printed data on race and Spanish origin were reduced, the data would
still be available from the summary tape files. We expect that the amount
of 1990 race and Spanish origin data available from the tapes should be
about the same as in 1980. Also, to make data more accessible to all data
users, we would consider expanding the amount of 1990 data on race and
Spanish origin available on microfiche and other media.

As was done with the 1980 census data, the State Data Centers and their 
affiliates, upon request, would compile 1990 data on Spanish origin and 
race from the summary tape files and make the data available to the data
users at a nominal cost. 





QUESTION 2. When the National Academy of Sciences' study of plans
for the 1990 Census was issued, our understanding was then that you would
continue to support the work of this panel. Will the panel continue its
work? Will you ask the NAS to prepare another followup report? Will you
then provide full funding for this study?

We regard the NAS work as a very important nonpartisan source of
information about the census. Has the Bureau agreed to continue funding
it?



ANSWER: The National Academy of Sciences has made significant contributions
toward the Census Bureau's establishment of research and operational priorities
for the 1990 Decennial Census. These contributions are of particular
importance to programs such as decennial census undercount and adjustment
research.

We are contracting with the Academy to convene a special panel meeting in
the late summer of 1986 to advise on the development of adjustment-related
programs. The Academy will present a report of observations and recommendations
to us in December 1986. Recommendations will help us develop the decennial
census adjustment standards.

To ensure continued participation by the Academy, we are negotiating for 
a two-year extension of the existing contract. During the second year, 
this contract extension will provide for on-site consultation with Bureau 
of the Census research principals to ensure an opportunity for the exchange 
of information on a schedule compatible with our commitment to critical
milestone dates. 



QUESTION 3. I understand from your testimony that the Bureau has decided
to use vendor lists as the main source for addresses to mail out the
census forms. However, you will also be using the 1980 census lists in
areas where the vendor lists are not considered to be very complete.
Could you clarify this point? How will you select the areas of the
country where you will use vendor lists and those where you will use 1980
census lists?



ANSWER: Vendor files will be evaluated to determine which are best for
specific portions of the urban United States. As part of the evaluation,
we will identify those urban areas where clear-cut deficiencies exist in
the vendor files. For these areas we will consider the use of the
1980 census address list as a useful substitute or supplement to the vendor
files. We are currently developing guidelines and criteria for evaluating,
selecting, and augmenting purchased vendor files. We will provide
the guidelines and criteria to the Subcommittee staff when they have
been completed.



QUESTION 4. Many of your decisions relating to the processing of the census
are based on the pretests. We have heard reports that the pretest in
Los Angeles had to be curtailed due to poor response from the public.
Could you please tell us what happened? How has it affected the test
and what will it mean for the census?






ANSWER: We planned to conduct full-scale censuses in two offices in
Central Los Angeles County: the North office serving 9 communities and
the South office serving 12 communities. We also planned, as a contingency,
to curtail some follow-up activities if mail-response rates were below
expectations.

Because of the extremely low mail-response rate in Los Angeles, we
determined that we could not complete the test procedures throughout
the planned test area with available resources. We decided to reallocate
available resources from the South office to complete the test in the
North office. We decided to continue the test in the North office as
opposed to the South office because the North office contained more test
objectives and evaluations. Also, the North office is smaller and had a
somewhat higher mail-response rate.

It is difficult to conduct a successful promotional campaign in a test
census environment without the massive national publicity and visibility
normally available during the decennial census. This problem is particularly
severe when the test area is only a portion of the market served by
the metropolitan mass media. For this reason we projected a mail-response
rate (45-50 percent) for the Los Angeles test sites much lower than it
would be for that area in a national census.

The initial mail-response rate in Los Angeles was even lower than we
expected. At the time we had to make our decision (March 27), the rate
for the South office was 24 percent and for the North office, 31 percent.
By April 3, when we printed the nonresponse follow-up listings, the rates
were 30 percent and 38 percent, respectively.

To help us evaluate the reasons for this low response rate, we surveyed
people in both the North and South areas to determine their exposure and
reaction to our promotion and community outreach activities. We have
efforts underway to develop an effective national promotional campaign
in 1990, and we will study what occurred in Los Angeles to suggest ways
to improve these efforts.

Before our 1986 test census in Los Angeles, we had defined a number of
objectives specifying what we planned to learn in conducting the test.
Some of the new or modified procedures we were testing applied to both
the North and South offices; a few were concentrated in the North office
by design. We believe we will still be able to learn virtually all we
had hoped to.

For those objectives that apply to operations that occur before Census
Day, we will have data from both the North and South offices. For those
that apply after Census Day we will still have enough good information to
give us confidence in the conclusions we draw from the 1986 census of
Los Angeles County.






QUESTION 5a. Among the documents you sent us to read there were a number
that set out some of the issues that are related to processing the census.
We greatly appreciate your sharing these with us even though they were
preliminary. But they do raise a number of questions. How will the
decisions you are making regarding the processing affect the kind and
timing of the evaluation of the census coverage? When will we learn how
complete the census was?



ANSWER: Planning for prompt coverage evaluation has been an integral
part of overall census planning for 1990. This is most evident in our
early commitment to pursue the dual strategy of (1) conducting the most
complete census possible and (2) measuring and being prepared to adjust
for coverage error on a timely basis. The requirements of the evaluation
program have been carefully considered throughout the planning process,
and all decisions, including those for processing, are thoroughly reviewed
for their effects on the evaluation program.

We believe that several components of our processing plans for 1990 will
significantly improve the accuracy and speed of our coverage evaluation
program compared to 1980. The three most important components affecting
coverage evaluation are: (1) automating the address control file, (2)
converting questionnaire data to computer-readable forms earlier
(concurrently with data collection), and (3) keying the names of persons in
census blocks in or adjacent to areas where we will conduct the post-enumeration
survey. The major effect of these automated processes is to allow us to
begin earlier than in 1980 to match names from the post-enumeration
survey to the census and to automate most of this matching (instead of
doing it clerically as in 1980).

As we proceed to implement our recent processing decisions, we will determine
the exact timing for estimating coverage. We believe it is essential
that the results of the coverage evaluation program be as timely as
possible.



QUESTION 5b. Have you decided to enter all of the names of the people 
into your computer? I understand that this has never been done before. 



ANSWER: No. We will enter surnames of householders in multiunit
structures for which a questionnaire was returned by mail. This automated
operation replaces a clerical operation in the 1980 census, when clerks
wrote surnames into the address registers. The surnames are needed by
follow-up enumerators who may run into apartment "mixups" in multiunit
structures. For example, the enumerator might visit Apt. 3G and find
that Mr. Doe says he has returned a questionnaire. The enumerator can
look at his listing and find that Mr. Doe returned a questionnaire but
it was for Apt. 3K.
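
The apartment "mixup" lookup described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Bureau's system: the function names, the (building, unit, surname) record layout, and the sample address are all hypothetical.

```python
from collections import defaultdict


def build_surname_index(returns):
    """Map (building, surname) to the units that mailed back a questionnaire.

    `returns` is an iterable of (building, unit, surname) tuples; this
    layout is an assumption made for the example.
    """
    index = defaultdict(list)
    for building, unit, surname in returns:
        index[(building, surname.strip().upper())].append(unit)
    return index


def check_unit(index, building, unit, surname):
    """Classify what the listing shows for a follow-up visit to one unit."""
    units = index.get((building, surname.strip().upper()), [])
    if unit in units:
        return "returned"
    if units:
        # Same surname, same building, different unit: a likely mixup.
        return "returned under other unit: " + ", ".join(sorted(units))
    return "no return on file"


# Hypothetical mail returns for one multiunit structure.
returns = [("100 Main St", "3K", "Doe"), ("100 Main St", "2A", "Roe")]
index = build_surname_index(returns)

# The enumerator visits Apt. 3G, where Mr. Doe says he already mailed a form.
print(check_unit(index, "100 Main St", "3G", "Doe"))
```

Keying the index on building and surname, rather than on unit, is what lets the enumerator find a return that was checked in under the wrong apartment number.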

We will also enter full names for the relatively small sample of persons
who live in the areas that will be covered by our coverage evaluation
survey (the post-enumeration survey) or areas adjacent to those covered
by the coverage evaluation survey. Names are needed to automate the
matching of persons from the survey to the census. Clerically matching
names was one of the major problems in the 1980 Post-Enumeration Program.
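
As a rough illustration of what automating the match of survey persons to census persons could look like, the Python sketch below matches on a normalized name within a census block and sets aside everything else for clerical review. The normalization rule, the (block, name) record layout, and the block identifiers are assumptions made for this example; the testimony does not describe the actual matching rules.

```python
def normalize(name):
    """Upper-case a name and drop periods, commas, and extra spaces."""
    cleaned = name.upper().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def match_survey_to_census(survey, census):
    """Split survey persons into automatic matches and clerical-review cases.

    Both inputs are iterables of (block, name) tuples (a hypothetical layout).
    A survey person matches only if the same normalized name was keyed
    for the same block in the census file.
    """
    census_keys = {(block, normalize(name)) for block, name in census}
    matched, review = [], []
    for block, name in survey:
        if (block, normalize(name)) in census_keys:
            matched.append((block, name))
        else:
            review.append((block, name))
    return matched, review


# Hypothetical keyed names for two blocks.
census = [("B101", "Doe, John"), ("B101", "Roe, Mary")]
survey = [("B101", "DOE JOHN"), ("B102", "Roe, Mary")]

matched, review = match_survey_to_census(survey, census)
print(len(matched), "matched automatically;", len(review), "left for clerical review")
```

An exact match on a normalized key is the simplest possible rule; it is shown here only to make the division between automated and clerical matching concrete.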

Before the census, when we compile the address lists for the more rural
areas, we will enter surnames for those addresses that are not street
name and number (for example, rural route delivery addresses). We did
this in 1980 also. For 1990, we will also have the ability to enter
updates when the names of householders change. Names are an essential
part of the mailing addresses in rural areas.



QUESTION 5c. How will you protect the privacy of the people whose names 
are in the computer? 



ANSWER: We have established as one of our six major goals for the
1990 census the maintenance of the confidentiality of individual census
information. This is especially a concern as we move to increased use
of automation in the census. We will not introduce automated systems
unless we can maintain the security of the data.

As in the past, we will make all employees aware of the Title 13,
United States Code, confidentiality requirements imposed on all
Census Bureau personnel. We will control the number of people with
authorized access to confidential information and implement safeguards
to prevent unauthorized access. Although we will have addresses and a
small sample of names on automated files, we will store the names and
addresses separately from the respondent-supplied questionnaire answers.
And we will program our computerized systems to monitor against unusual
or unauthorized searches of data files.



QUESTION 5d. Are you going to use the names to match the census returns
with other government files? Why are you doing this? What will be its
benefits and drawbacks?



ANSWER: We do not intend to match our census files to other government
records to create a master data file of information about individuals.
During the census-taking operations, however, as a method of identifying
persons who were potentially missed, we are developing a program that
will allow us to compare other files to the census. We will enumerate
those persons that we verify were actually missed in the census. This
program will allow us to use these other sources to improve the count of
the population but only as protected under our guarantees of confidentiality.







QUESTION 6a. In one of the documents the Bureau sent us there was a discussion
of how you might adjust the census returns to make up for an
undercount. Now, I realize that this was just a staff paper but it does
raise a number of questions. The paper presents a timetable for collecting
information about the quality of the census coverage and reporting it to
you. If I understand it correctly, it would require you to complete the
fieldwork of the census in July. The fieldwork of the census in 1980 was
not finished until well into the fall. Do you think you can make this
timetable?



ANSWER: Our goal will be to complete fieldwork in as many offices as
possible in July. The increased use of automation and improvements
in census procedures and in hiring, retaining, and managing enumerators
may give us a better chance than in 1980 of closing offices this early.
As field offices complete the enumeration, data collection for the coverage
measurement survey will begin. We will be able to start this data
collection on a flow basis to allow for some office closings later than
July and still complete all processing on a timely basis.



QUESTION 6b. The paper calls for keying lots of names of people. Do you 
think you will be able to do this? 



ANSWER: Yes. We have decided only to key surnames for householders in
multiunit structures who return their questionnaires by mail, a small
sample of full names, and updates to names of householders in rural-route
delivery areas. The full name keying will be done only for those
blocks in or adjacent to areas where we are conducting the coverage
evaluation survey. We anticipate no problems in performing this smaller
keying workload in time to begin the coverage evaluation survey matching.



QUESTION 6c. The paper says, "Under the strategy adopted by the Bureau,
the director is to review the results of both the census and the coverage
measurement survey, compare them with the established standards and
announce a decision on whether the census will be adjusted." (page 130)

In other words, the paper says that the Census Bureau director will
decide whether to adjust the census after he or she knows what the
effect of adjustment will be on the apportionment of the House of
Representatives.

Is this the Bureau's current policy? 





ANSWER: We will make our decision on whether to adjust the 1990 census
figures after we know the results of our coverage measurement program and
the quality of both the enumeration and the evaluation. Our major criterion
will be whether we can improve the counts by adjusting them. While we
may know the effect of adjustment on apportionment at that time, that
will not be considered in making our decision. The Director will make a
decision based on established standards that will be printed in the
Federal Register before the 1990 census.



QUESTION 6d. How will you decide whether to adjust the census? 



ANSWER: We will decide whether to adjust based on the results of our
coverage measurement program and the results of the census. We will be
looking at whether adjustment will improve the counts. We will review
the results of both the census and the coverage measurement survey and
compare them with established standards. The key to this strategy is
reaching consensus on the standards before the census is taken. We will
publish proposed standards in the Federal Register well in advance of the
census.



QUESTION 6e. Will you be up here soon asking us to change section 195 of
Title 13 to give you the authority to use sample data for the apportionment
of the Congress?



ANSWER: Following the 1980 census, the courts reviewed whether Section 195
prohibits the use of any data based on a sample. In three cases, City of
Philadelphia v. Klutznick, Carey v. Klutznick, and Young v. Klutznick,
the district courts interpreted Section 195 to mean that sampling is
prohibited only if it is the sole means of determining the population for
apportionment. Based on these decisions we do not see any legal barrier
to supplementing the count resulting from a full-scale enumeration by
techniques that use sampling. If there is any ambiguity concerning that
interpretation, the law may need to be clarified before any action.



QUESTION 6f. Now, we don't know who the Director of the Census will be in
1990. That largely depends on the outcome of the 1988 election. Assuming
that someone else was the director and that he or she was appointed by a
president of a different party from your own, would you trust him or her
to make the decision about adjusting the counts after it was known what
the effect on apportionment would be?






ANSWER: The plan for determining whether or not to adjust has been designed
so as not to be dependent only upon the judgment of one individual. By
developing criteria that are agreed upon in advance, we are removing the
need to trust the judgment of one specific person and/or any concern about
whether the effect upon apportionment is known to that individual. We believe
that the best plan for a decision must be based upon definite knowledge
about the results of our coverage evaluation program and the quality of both
the census and the evaluation.



QUESTION 6g. Why wouldn't it be better to decide whether to adjust before
the census is taken and avoid the difficulty of having to make a controversial
decision under a great deal of pressure? Wouldn't this allow for
a fuller debate on the topic and more careful planning of how it will or
will not be done?



ANSWER: We will decide whether to adjust based on the results of our
coverage measurement program and the results of the census. Thus, we
cannot decide until after the census. Also, because there are statistical
problems associated with the program to develop measures of the undercount,
we must know how well it works, and at what geographic levels we might be
able to perform an adjustment, before we can make a responsible judgment
about whether or not adjustment will improve the census counts. There is
ample time, however, for a full discussion and careful planning of the
standards that will be used to make that judgment.



QUESTION 7. Regarding the costs of the processing that you are proposing
for the census, will the equipment that you are purchasing be cost effective?
How much money will the government save from the processing alternatives
that you are considering?



ANSWER: Yes, we believe our equipment decisions will be cost-effective.
We will only invest in automation that reduces costs or significantly
improves the census. While we cannot know at this time whether a specific
automation decision will save money, we believe our decisions will lead
to greater efficiency and to a quicker and more accurate census. Automating
census operations will allow us to replace labor-intensive and error-prone
clerical operations with automated techniques that are more accurate and
controllable.







Ms. Oakar. Doctor, thank you very much for your comprehen- 
sive statement. If you can, I would like to request that you wait 
until we have this other panel and then join them for questions. Is 
that possible? 

Mr. Keane. I shall. 

Ms. Oakar. Thank you very much. 

Our next witnesses are Dr. Dan Horvitz, who is the executive 
vice president for Research Triangle Institute, which is a large 
nonprofit survey research organization based in North Carolina. 
Dr. Horvitz is also vice president of the American Statistical Asso- 
ciation. 

Prof. William Eddy is with the department of statistics of the 
Carnegie Mellon University, if he would come up, please. He has 
worked as a consultant for the Postal Service in the area of auto- 
mation. 

Ms. Judith Rowe is the associate director for research services of 
the Princeton University Computing Center. She is past president 
of the Association of Public Data Users and plays a prominent role 
among users in articulating the needs of university and private re- 
searchers. 

And Dr. Benjamin F. King is the director of survey methods for 
the Educational Testing Service. Dr. King serves as the chairman 
of the American Statistical Association's Advisory Committee to 
the Census Bureau, and he has been a member of the panel on the
1990 census convened by the National Academy of Sciences. 

We really do have a very distinguished group, and we would be 
happy to have your complete statements for our record. If you can 
summarize, it would be helpful so we could have more time for 
questions at the end. 

So we will start with you, Dr. Horvitz. 

STATEMENT OF DANIEL G. HORVITZ, EXECUTIVE VICE 
PRESIDENT, RESEARCH TRIANGLE INSTITUTE 
Mr. Horvitz. Thank you, Madam Chairwoman. 
I presume the statement will be included in the permanent 
record, and I will just make some comments. 

Ms. Oakar. Absolutely. Your entire statement will be included 
for the record. 

Mr. Horvitz. My remarks will be confined to the automated 
check-in system and the concurrent processing goal. Clearly the im- 
portant goals of the 1990 census are to increase the efficiency of 
the census, the quality of the census and the timing of the census, 
and current plans certainly are moving in that direction through 
greater use of automation, through decentralization of the process- 
ing facilities and through concurrent processing. 

In my opinion, the plans that the Bureau have been preparing 
have considerable potential for achieving greater accuracy. Cer- 
tainly the automated check-in control, the automated control file, 
can help considerably to reduce the coverage errors that have oc- 
curred in previous censuses. 

I was going to be talking about the potential for the computer 
edit and followup as a way of reducing content errors also over 
prior censuses. The remarks of Mr. Keane this morning suggest
there is some reduction in the extent to which there will be computer
edit followup as compared to previous censuses, and it is
a little disappointing from my standpoint that there is going to be
much more clerical edit, which was the process in prior censuses.
Based upon his remarks this morning, this is anticipated in the in- 
formation that was available to me with respect to the level of the 
current plans of the census. 

Certainly from my standpoint, through automation, one has an 
opportunity to address problems with the census with respect to 
coverage and content early in the process, and all of my experi- 
ences show, it seems, that the closer you get to the source of the 
data in terms of time and space, the better job you will do. You 
will detect errors earlier and you correct those errors earlier. Auto- 
mated processing helps this process considerably. So a reversion to 
a large segment of the population receiving only clerical edits I 
think puts us back where we were in 1980. 

My written testimony certainly supports the plans and I certain- 
ly applaud the Bureau's efforts. I just think that the Bureau has 
been much too tentative in moving strongly in the direction of au- 
tomation. I am surprised that we are here already in 1986 and not 
further along in the planning. Certainly it is possible that the 
Bureau, being challenged in 1980 with respect to the magnitude
of the undercount in certain small areas and local areas, may
have delayed excessively the effort available for the 1990 plans and 
that is unfortunate. The people who are the leaders at the Bureau 
with respect to planning, carrying out the census, and then answer- 
ing the challenges are the same people, and they just may be over- 
worked in that respect. 

The cost of computer hardware and the complex software needed 
to more fully automate the census may be major stumbling blocks. 
Still, from my standpoint, the real potential for gains in quality 
and in productivity justify the added expense. My own experience 
in large-scale surveys has shown a very noticeable gain in both the 
quality and the productivity, despite the fact that getting set up for 
automation, it being a complex process, can provide many frustra- 
tions. But after you have things in place, properly tested and work- 
ing, you forget those frustrations because of the gains that you see 
in the quality of the work that is being done. 

Concurrent processing provides, I think, a major advantage in 
the ability to follow up and correct what would be called fail-edit 
cases in a timely manner. I feel that the census, particularly now 
in hearing what I heard this morning, is really not planning to ex- 
ploit concurrent automation and the technology for concurrent au- 
tomation that exists, sufficiently, particularly with respect to the 
followup procedure of fail-edit cases, certainly with respect to the 
quality of the edit process itself, if it remains a clerical operation 
for a good part of the census. 

I have advocated that the technology exists for automating the 
process of followup in the telephone process by using computer-as- 
sisted telephone interviewing rather than paper and pencil tele- 
phone interviewing. Based on what I heard this morning from Mr. 
Keane, the telephone followup will be a paper and pencil operation. 
The computer-assisted telephone interviewing is a well-tested technology.
It does have costs associated with the fact that each interviewer
or enumerator working with a telephone must have a video
terminal. The cost of video terminals are going down rapidly, and 
it seems to me that in the next few years the costs will be even less 
than they are today. 

As I see the use of automation, we would have automated check- 
in, we would film, we will use FOSDIC, we will have computer 
readable data available concurrently with the collection of the 
data. That computer readable data can be transmitted back to 
processing offices from, say, any central office where the main com- 
puting is being done, can be transmitted via telecommunication 
back to the computers to support the computer-assisted telephone 
interviewing and processing offices. It does not appear that the 
Bureau is even considering the use of transmitting data via leased
lines and establishing a network of their processing offices in 
which data are transmitted that way rather than shipping. The 
plan is to ship questionnaires. I do not think that that is necessary 
in this day and age. 

I am also concerned that the technology exists and has been 
proven in the past to generate questionnaires for followup purposes 
as in the computer-assisted telephone interviewing. Basically the 
data would be in the computer. The enumerator would access that 
data in carrying out the interview, the followup interview to cor- 
rect that fail-edit aspect. There are cases where you have to follow 
up in the field with a personal interview for households that do not 
have telephones, and in those cases also it is possible to generate 
out of the computer a questionnaire for the enumerator to use. The
enumerator can be in one remote office from where the basic data 
are, and that data can print out that instrument for the enumera- 
tor to use in the local office. 

Concurrent processing is, in my opinion, a very valuable ap- 
proach, and the Bureau certainly is making headway in that direc- 
tion. From my standpoint, it is not taking full advantage of the po- 
tential that exists with the technology that is available today. 

I would like to just read a few things that I feel are also essential
for the Bureau to consider from my written testimony.

I did not expect to see any discussion of Bureau plans to measure
the quality of the 1990 census data at this stage, although I do
know that they do have plans for the post-enumeration survey. I
would like to suggest that there is room to consider components of
error in a census in 1990 and would like to have the Bureau develop
a plan for assessing the level of error introduced by all the
check-in and data processing procedures used to establish the final
data file record for each person enumerated and each household. 
This is a component of the total error. Clearly, there are many 
other sources of error, particularly with the quality of the coverage 
in the field and the quality of the data that is provided by the 
householders to begin with. 

We have been conducting decennial censuses since 1790. Federal 
revenue sharing based on population and income data has en- 
hanced the level of interest in census results considerably. The 
pressure to produce an accurate census, but within stringent budg- 
etary constraints, is now heavier than ever. Yet the Bureau of the 
Census is expected to design and put in place procedures for collecting
the data from about 100 million households in a short 2- to
3-month period, hiring and training a huge temporary staff to do
so. From my perspective, the current time and budgetary con- 
straints for conducting decennial censuses are inconsistent with the 
expected level of quality in small area data. There is a clear need 
to consider more rational census alternatives than our current de- 
cennial mobilization. It is also important to begin to consider such 
alternatives now if we are to be in a position to bring about a 
change prior to the year 2000. 

One such alternative goes to the other extreme from current 
practice. It advocates collecting census data continuously in time 
and space covering approximately 1 percent of the population each 
month and approximately 10 percent each year. To explain, consid- 
er the 3,100 counties in the United States. It is possible to select 10 
nonoverlapping samples of counties, such that each sample would 
have approximately 10 percent of the total population and also be 
representative of all the counties in the United States. Each 
sample will consist of about 310 counties. One of the 10 samples is 
selected at random each year without replacement and a census 
conducted in the counties included in the selected sample. In this 
manner, each county would have a census taken once every 10 
years, which is also the case currently. 
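
The rotation scheme just described can be sketched in Python. This is a minimal sketch under stated assumptions, not a design from the testimony: the county populations are synthetic, the greedy balancing rule is one simple way to give each sample roughly 10 percent of the population, and the sketch does not try to make each sample "representative" in any other respect (region, urbanization, and so on). The names `partition_counties`, `populations`, and `schedule` are hypothetical.

```python
import random


def partition_counties(populations, n_groups=10):
    """Split counties into equal-sized, nonoverlapping groups of similar population.

    Greedy rule (an assumption for this sketch): take counties from largest
    to smallest and give each to the group with the smallest running total
    that still has room.
    """
    per_group, remainder = divmod(len(populations), n_groups)
    if remainder:
        raise ValueError("number of counties must divide evenly into groups")
    groups = [[] for _ in range(n_groups)]
    totals = [0] * n_groups
    for county in sorted(populations, key=populations.get, reverse=True):
        open_groups = [g for g in range(n_groups) if len(groups[g]) < per_group]
        target = min(open_groups, key=lambda g: totals[g])
        groups[target].append(county)
        totals[target] += populations[county]
    return groups, totals


rng = random.Random(1990)

# Synthetic populations for 3,100 counties (illustrative values only).
populations = {
    f"county_{i:04d}": int(rng.lognormvariate(10, 1.2)) + 100
    for i in range(3100)
}

groups, totals = partition_counties(populations)

# Draw the 10 samples in random order without replacement: one per year.
schedule = list(range(10))
rng.shuffle(schedule)

shares = [total / sum(totals) for total in totals]
for year, g in enumerate(schedule, start=1):
    print(f"year {year}: sample {g}, {len(groups[g])} counties, "
          f"{shares[g]:.1%} of population")
```

Because the samples do not overlap and each is used exactly once per cycle, every county is enumerated once every 10 years, as in the testimony, while the fieldwork is spread evenly across the decade.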

The quality of the census data with this alternative should be 
considerably greater than at present for several reasons, not the 
least of which is the amount of staffing required and the greater 
use of permanent staff. The country could be divided up into 31 re- 
gions with a permanent supervisory staff in each region. The staff 
in each region would be responsible for conducting a census in 10 
counties each year, one county each month of the year, except, say, 
June and December. Thus the workload would be distributed uni- 
formly over time and space. 

An added advantage of this procedure is that accurate data on 
internal migration can be gathered, data which have never been 
available in intercensal years. Intercounty migration rates could be 
developed since each 310 county annual sample will produce data 
on in-migration for each census from all other U.S. counties and 
foreign countries, as well as data on out-migration from that 
county to 309 other U.S. counties. These current migration rates 
could be used to improve considerably intercensal year estimates of 
county populations over those computed currently. Thus, the allo- 
cation of Federal revenue sharing funds can be based on much 
better estimates of population in intercensal years than is possible 
with a census in all counties simultaneously only once every 10 
years. 

While I feel that costs should be less on some items and may go 
up on others with this process, the most important thing is the 
total cost would be distributed over each of the 10 years rather 
than incurring approximately 10 times the cost once every 10 
years. 

To summarize my remarks, I strongly support the current 
Bureau plans for the 1990 census, particularly the increased use of 
automation and the decentralization of processing facilities and the 
use of concurrent processing. I feel greater use of automation 
should be there. In particular, automation causes anybody engaged 
in operations as complex as the census to impose a much higher
level of discipline to the data gathering and data processing enterprises,
resulting, at a minimum, in greater staff productivity and,
potentially, in a significant increase in quality. It is that disciplin- 
ing, I think, that is an advantage in the end for large-scale under- 
takings such as this. That is really essential to produce the kind of 
census and the quality of census we desire. It is important to notice 
with automated procedures every case receives the same level of 
treatment in the census process. 

In view of the complexity of the automated systems contemplat- 
ed for the 1990 census, it is none too soon to begin implementation. 
I therefore urge the Bureau to come to closure on all aspects of the 
1990 census processing procedures and to move rapidly toward im- 
plementation. This subcommittee and the Congress could assist the 
decision process at the Bureau by first demanding careful docu- 
mentation of the computer hardware requirements to conduct an 
automated census at the planned level; second, based on this docu- 
mentation, negotiating a mutually acceptable set of requirements; 
and, third, providing some assurance to the Bureau that the sepa- 
rate budget line item for the computer hardware included in the 
mutually acceptable set of requirements will remain intact. I rec- 
ommend completion of this process no later than the end of this 
fiscal year. 

Finally, I view the mode of conducting decennial censuses anom- 
alous with the demand for even greater accuracy in the small area 
data reported. Therefore, I recommend that a serious effort be un- 
dertaken to examine alternatives to the decennial census for gener- 
ating accurate small area data on the U.S. population. 

Thank you. 

Mr. Ackerman [presiding]. Thank you very much. I know that
Chairman Garcia shares the same concerns as you do. 

[The statement of Daniel G. Horvitz and his response to written 
questions follow:] 






—RESEARCH TRIANGLE INSTITUTE 



STATEMENT 
OF 



DANIEL G. HORVITZ 
EXECUTIVE VICE PRESIDENT 
RESEARCH TRIANGLE INSTITUTE 



TO THE 

SUBCOMMITTEE ON CENSUS AND POPULATION 
MAY 1, 1986 



Mr. Chairman, I am Daniel Horvitz, Executive Vice-President of the 
Research Triangle Institute, a private, not-for profit research institute 
located in the Research Triangle park of North Carolina. From its in- 
ception in 1958, the Institute has had a very active statistical research 
program, serving both the public and private sectors. It currently has a 
staff of 150 engaged in the design and conduct of social surveys relating 
primarily to health, education, and economic well-being. I am a statis- 
tician with over 35 years of experience in survey research. It has been my 
privilege to serve on the American Statistical Association Advisory 
Committee to the Bureau of the Census and I am pleased to have this opportunity 
to comment on the Bureau's plans for processing the 1990 Decennial 
Census. My remarks will be confined to the Bureau's technical papers 
covering automated check-in systems (Topic 2) and concurrent processing 
(Topic 3). 

The primary challenges of the 1990 Decennial Census are to provide 
population data of higher quality in more timely fashion, and to do so in 
a more cost effective manner than has been achieved in the past. To meet 
these challenges the Bureau's current plans for processing procedures 



Post Office Box 12194 



Research Triangle Park, North Carolina 27709-2194 



Telephone: 919-541-6000 





encompass several major innovations including greater use of automation, 
less centralization of processing facilities, and implementation of data 
processing in parallel with data collection. 

The increased use of computer technology currently contemplated includes 
an automated address control file (ACF), automated check-in of 
questionnaires using bar code technology, automated generation and mailing 
of reminder post cards to nonrespondents, automated generation of non- 
response addresses for field follow-up, automated editing of completed 
questionnaires for inadmissible, inconsistent or missing items, and auto- 
mated generation of fail-edit cases for either correction by an internal 
Edit Review Unit or correction by means of telephone follow-up (or per- 
sonal visit follow-up, if necessary) with the household in question. The 
1980 Decennial Census used manual procedures for all of these components 
of the collection process. Reminder post cards were not used in 1980. 

Accompanying the contemplated increased use of automation in con- 
ducting the data collection phase of the 1990 Decennial Census is an exam- 
ination of the potential for greater decentralization of the data pro- 
cessing functions. In 1980, all completed questionnaires were shipped to 
one of three processing centers where the data were converted to computer 
readable form. Cameras were used to take a picture of each questionnaire, 
the film was processed, and finally the images on the film were converted 
to computer readable data using FOSDIC (Film Optical Sensing Device for 
Input to Computers). 

This same three-step process is contemplated for 1990 and is desig- 
nated as FACT 90 by Census staff. The feasibility of increasing the 
number of FACT 90 locations from the three used in 1980 to as many as 24 
in 1990 is being analyzed by Bureau staff. It should be noted that it is 






not necessary to confine all three FACT 90 steps to the same location. 
For example, cameras can be located at a large number of Processing 
Offices (PO's) to film the completed questionnaires. The undeveloped film 
can then be sent to FOSDIC centers for the remaining two FACT 90 steps. 
Bureau staff are also examining the feasibility of deploying cameras to as 
many as 50 locations with the film being developed and converted in no 
more than three locations. 

Conversion of the questionnaire data to computer readable form in the 
1980 Decennial Census did not begin until the data collection phase was 
complete. The plans for greater automation in 1990, including computer 
editing of questionnaires with telephone and personal visit follow-up of 
fail-edit cases that cannot be resolved otherwise, dictate the use of 
concurrent or parallel processing rather than the sequential procedure 
used in 1980. The decision to implement automated processing of completed 
questionnaires much earlier for the 1990 Decennial Census was made last 
fall at the Bureau's Decennial Census Decision Conference (DCDC). An 
extremely important consequence of this concurrent processing decision is 
the potentially significant improvement in the quality, timeliness, and 
cost effectiveness of the 1990 Census. 

There is little doubt in my mind that the current plans for gathering 
and processing the 1990 Decennial Census have considerable potential for 
achieving greater accuracy than in the past. The use of automated systems 
for control of census operations involving mailings and field assignments, 
questionnaire check-in, and telephone and field follow-up can reduce 
coverage errors significantly. These systems together with the planned 
computer edit for missing or inaccurate questionnaire entries should also 
result in considerable reduction, compared to prior censuses, in the 







number of clerical level staff required. The real beauty of these 
proposed systems is that they address actual errors occurring in both 
coverage and content early on in the process. They also have considerable 
potential for preventing errors that might otherwise occur. It has been 
my experience that better quality data are produced when systems are in 
place to detect and correct errors as close in time and space to their 
source as possible. 

It should be recognized, however, that the proposed systems are 
complex. A number of major technical and logistical decisions remain to 
be made including the number and location of Processing Offices, the 
hardware and software required, the type and numbers of personnel to be 
recruited and trained, and the telecommunication network to be 
established. Currently, Bureau staff are addressing these issues through 
analysis of a series of action plans, using a set of common assumptions. 
In fact, the initial results of this analysis for the first 10 of these 
action plans were completed on March 14, 1986. 

As indicated above, I heartily support the Bureau's automation and 
concurrent processing initiatives for the 1990 Census and congratulate 
their efforts. I have not identified any aspect with which I have major 
disagreement. On the other hand, it is quite possible that the Bureau has 
been too tentative about moving strongly in the direction of automation. 
I am surprised that it is already 1986 and that the Bureau is not further 
along in its planning than it appears to be at this moment. The demands on 
Bureau staff to address challenges and issues relating to the 1980 Census 
undercount may well have delayed earlier examination of the automation 
potential for 1990 at a sufficiently high intensity level. There may 
exist, among some Bureau staff, a feeling of satisfaction with the status 
quo, a strong inertial force often arising in the face of unwanted change, 
which may also have contributed to the slow rate of progress. 






There is little doubt that the cost of computer hardware and the 
rather complex software development requirements can be major stumbling 
blocks to the adoption of a fully automated census in 1990. For these 
reasons I feel the cost and quality tradeoffs deserve to be developed and 
studied in some depth. While the hardware costs are far from trivial, 
particularly in the face of their relatively short period of use in a 
census, I am confident the gains in quality and productivity will justify 
the added expense. My own experiences with the adoption of automated 
procedures for use in the conduct of complex sample surveys have often 
produced frustrations initially, but then have resulted in such noticeable 
gains in quality and productivity that I now have difficulty remembering 
that there were any initial problems. 

The development of an accurate, integrated system design for an 
automated 1990 Census with concurrent data processing as currently 
envisaged might best be accomplished using currently available computer- 
aided tools for systems development (such as Case 2000 or Excelerator). 
These tools use structured analysis and rigorous design techniques in a 
computer-aided environment that can lead to a significant increase in the 
quality of the detailed specifications, and hence in the overall quality 
of the ultimate system. I am personally not at all conversant with these 
tools, but mention them here since they might well provide the essential 
ingredients for addressing in an efficient and accurate manner the 
perceived complexities of a fully automated census. 

A major advantage of concurrent data processing is the ability to 
follow-up and correct fail-edit cases in a timely manner. These are 
completed questionnaires which contain errors in the information provided 
for some items or for which the data are missing entirely on other items. 







Current plans call for resolution of such fail-edit cases by an internal 
Edit Review Unit. In the event detected errors cannot be resolved by 
Census staff, the household will be contacted by telephone and new, 
presumably correct, data generated for the fail-edit items. Personal 
visits will be made to those households with unresolved fail-edits which 
do not have a telephone. Since the originally completed questionnaires 
will be checked in and stored in Processing Offices, this procedure 
presents logistical problems when a fail-edit case designated for personal 
visit follow-up is referred to a District Office (i.e. a data collection 
office) which is not also a Processing Office. Since there is no original 
questionnaire available in these District Offices (DO's) only a printed 
list of the items that failed-edit will be available to the interviewer. 
While Bureau staff recognize that new questionnaires containing the 
accepted original responses, but with the fail-edit items left blank, can 
be computer-generated for use in personal visit follow-ups in these latter 
DO's, the Bureau does not plan to do so for the 1990 Census, ostensibly because 
rapid full name capture is required, but not considered feasible. 

In my opinion, there is much to be said in favor of generating new 
questionnaires, each with the accepted original responses but with the 
fail-edit items left blank, for use by the Census enumerators in personal 
visit follow-ups of fail-edit cases. The technology has been available 
for some time. My own experience, as director of the large National 
Medical Care Utilization and Expenditure Survey (NMCUES) in 1980, includes 
computer generation and printing, at remote sites, of two data collection 
instruments for use in subsequent interviews. The data generated for both 
instruments had been collected in earlier interviews. One of these 
instruments, called the Control Card, was a preprinted form, and contained 







all the demographic data about the family unit. Bar code identification 
consistent with the family unit identification number in the computer was 
also printed on the Control Card for automated check-in and control 
purposes. 

The second instrument, called the Summary, was formatted entirely by 
the computer and contained data on all the medical services received and 
reported for each member of the household during prior interviews. There 
were fail-edit items in both forms. For example, the respondent may not 
have known the charge for a doctor's office visit reported in the previous 
interview because the bill had not yet been received at the time the 
respondent was interviewed. Therefore, the Summary for the current 
interview included all the information about the specific office visit and 
requested the interviewer to ask for the charge for that visit. The NMCUES 
experience in obtaining updated data on the Control Cards and Summaries 
was generally excellent. Essentially the same technology can be used in 
the 1990 Census, in my opinion, to generate questionnaires for field 
follow-up of fail-edit cases requiring a personal interview. Name capture 
seems unnecessary since the enumerator will have all the originally 
reported data for use in verifying that a particular family unit is the 
same one that sent in the original questionnaire. 

Similarly, the Bureau staff recognize that Computer Assisted 
Telephone Interviewing (CATI) can be used for telephone interview of 
fail-edit cases, but the Bureau does not plan to do so for the 1990 Census. 
Without belaboring the point, CATI technology has also been around and proven for 
some time. Its value over paper and pencil telephone interviews is 
clear. The cost associated with providing the CATI interviewers with 
video terminals and the essential minicomputer support is most likely the 







primary reason for the Bureau's reluctance to use CATI for this purpose. 
A careful cost and quality comparison of the two telephone interview 
procedures for fail-edit cases before ruling CATI out would be very 
useful. Since there are now a rather large number of private survey and 
market research facilities with CATI capabilities, the Bureau might also 
give serious consideration to contracting with such facilities for 
telephone interview follow-up of fail-edit cases. This would eliminate 
the need for a large investment by the Bureau in computer hardware to 
support the use of CATI for follow-up of fail-edit cases. 

While I did not expect to see any discussion of Bureau plans to 
measure the quality of the 1990 Census data at this stage, I would like 
to suggest that the Bureau develop a plan for assessing the level of error 
introduced by all the check-in and data processing procedures used to 
establish the final data file record for each person enumerated and each 
household enumerated. 

We have been conducting decennial censuses since 1790. Federal 
revenue sharing, based on population and income data, has enhanced the 
level of interest in census results considerably. The pressure to produce 
an accurate census, but within stringent budgetary constraints, is now 
heavier than ever. Yet the Bureau of the Census is expected to design and 
put in place procedures for collecting the data from 100 million 
households in a two month period, hiring and training a huge temporary 
staff to do so. From my perspective, the current time and budgetary 
constraints for conducting decennial censuses are inconsistent with the 
expected level of quality in small area data. There is a clear need to 
consider more rational census alternatives than our current decennial 
mobilization. It is also important to begin to consider such alternatives 
now, if we are to be in a position to bring about a change prior to the 
year 2000. 







One such alternative goes to the other extreme from current practice. 
It advocates collecting census data continuously in time and space 
covering approximately one percent of the population each month and 
approximately 10 percent each year. To explain, consider the 3,100 
counties in the United States. It is possible to select 10 nonoverlapping 
samples of counties such that each sample will have 
approximately 10 percent of the total population and also be 
representative of all the counties in the United States. Each sample will 
consist of about 310 counties. One of the 10 samples is selected at 
random each year without replacement and a census conducted in the 
counties included in the selected sample. In this manner, each county 
would have a census taken once every 10 years, which is also the case 
currently. 
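
The rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the testimony does not say how the 10 samples would be drawn, so the sketch ranks counties by population, deals each consecutive block of 10 counties at random across the 10 samples, and then shuffles the samples to fix the order of census years. The function name `build_rotation` and the county identifiers are hypothetical.

```python
import random

def build_rotation(county_pop, n_samples=10, seed=1990):
    """Assign every county to exactly one of n_samples yearly samples.

    county_pop: dict mapping a county identifier to its population.
    Ranking by size and dealing out blocks is an assumed method, chosen
    so that each sample holds a similar mix of large and small counties.
    """
    rng = random.Random(seed)
    ranked = sorted(county_pop, key=county_pop.get, reverse=True)
    samples = [[] for _ in range(n_samples)]
    for start in range(0, len(ranked), n_samples):
        block = ranked[start:start + n_samples]
        slots = list(range(n_samples))
        rng.shuffle(slots)  # random assignment within the block
        for county, slot in zip(block, slots):
            samples[slot].append(county)
    # Random order of years, without replacement: each sample is used once.
    rng.shuffle(samples)
    return {year: sample for year, sample in enumerate(samples, start=1)}
```

With 3,100 counties, each yearly sample contains 310 counties, and every county is enumerated once in each 10-year cycle.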

The quality of the census data with this alternative should be 
considerably greater than at present for several reasons, not the least of 
which is greater use of permanent staff. For example, the country could 
be divided up into 31 regions with a permanent supervisory staff in each 
region. The staff in each region would be responsible for conducting a 
census in 10 counties each year, one county each month of the year, 
except, say, June and December. Thus, the census workload would be 
distributed uniformly over time and space. 

This census procedure has an important added advantage. Accurate data 
on internal migration can be gathered, data which have never been 
available in intercensal years in the past. Intercounty migration rates 
can be developed since each 310 county annual sample will produce data on 
in-migration for each census county from all other U. S. counties (and 
foreign countries), as well as data on out-migration for each census 





county to 309 other U. S. counties. These current migration rates can be 
used to improve considerably intercensal year estimates of county 
populations over those computed currently. Thus, the allocation of 
Federal revenue sharing funds can be based on much better estimates of 
population in intercensal years than is possible with a census in all 
counties simultaneously only once every 10 years. 

Costs for this census alternative should be less in view of a 
significant reduction in staff hiring and training costs compared to the 
current census procedure. Equipment costs, including computer hardware, 
should also be much less. There could be some cost increases associated 
with data processing and data dissemination. More importantly, the total 
cost would be distributed over each of the 10 years, rather than incurring 
approximately 10 times the annual cost once every 10 years. 

Of course, one could dismiss this alternative quickly in view of 
constitutional requirements with respect to apportionment. On the other 
hand, if this suggested census alternative has sufficient merit on all the 
usual grounds by which census methods are evaluated, then a real potential 
for resolution of the apportionment issue exists, albeit not without 
considerable effort. 

To summarize my remarks, I strongly support current Bureau plans for 
the 1990 Census, particularly the increased use of automation, the 
decentralization of processing facilities and the use of concurrent 
processing. The greater use of computer technology in the 1990 Census 
will impose an essential and significantly higher level of discipline to 
the data gathering and data processing enterprises, resulting, at minimum, 
in greater staff productivity and, potentially, in a significant increase 
in quality. It is important to recognize that, with automated procedures, 






every case receives the same level of treatment in the census process. 
For these reasons, I recommend the use of CATI for telephone follow-up of 
fail-edit cases and the use of computer generated questionnaires for those 
fail-edit cases requiring personal visit follow-up. In the face of strict 
limitations on the 1990 computer hardware budget, I recommend the use of 
contractors for CATI telephone follow-up of fail-edit cases. I also 
recommend the Bureau develop procedures for measuring the level of error 
introduced into the 1990 Census by the check-in and data processing 
phases. 

In view of the complexity of the automated systems contemplated for 
the 1990 Census, it is none too soon to begin implementation. I therefore 
urge the Bureau to come to closure on all aspects of the 1990 Census 
processing procedures and to move rapidly toward implementation. This 
Subcommittee, and the Congress, could assist the decision process at the 
Bureau by, first, demanding careful documentation of the computer hardware 
requirements to conduct an automated census at the planned level; second, 
based on this documentation, negotiating a mutually acceptable set of 
requirements; and, third, providing some assurance to the Bureau that the 
separate budget line item for the computer hardware included in the 
mutually acceptable set of requirements will remain intact. I recommend 
completion of this process no later than the end of this fiscal year. 

Finally, I view the mode of conducting decennial censuses anomalous 
with the demand for even greater accuracy in the small area data reported. 
Therefore, I recommend that a serious effort be undertaken to examine 
alternatives to the decennial census for generating accurate small area 
data on the U. S. population. 

This concludes my remarks. 







ANSWERS TO QUESTIONS ON 1990 CENSUS TESTIMONY 



Prepared by Daniel G. Horvitz 

Executive Vice-President 
Research Triangle Institute 

for the 

SUBCOMMITTEE ON CENSUS AND POPULATION 
of the 

COMMITTEE ON POST OFFICE AND CIVIL SERVICE 
U.S. HOUSE OF REPRESENTATIVES 



Question 1. Could you tell us some more about how the Census Bureau could 
use computer-assisted telephone interviewing in the decennial? 

— How expensive do you think it would be to apply this technology 
to the 1990 Census? 

— Up to now has it been used for any operation as large as the 
decennial? 

— What leads you to believe that it would be practical for the 
census? 



In my testimony, I recommended the use of Computer-Assisted Telephone 
Interviewing (CATI) for telephone follow-up of fail-edit cases and the use 
of computer generated questionnaires for those fail-edit cases requiring 
personal visit follow-up. In order to implement these recommendations, a 
specific level of automation must be planned and in place. Thus, the 
context of my CATI recommendation assumed a strong effort to minimize 
procedures which would require any further manual handling of census 
questionnaires, once they had been received and photographed in a Census 
district, regional or processing office. The level of automation 
contemplated by me for the use of CATI follow-up of fail-edit cases was 
consistent with, and among, the automation alternatives under review at 
the Bureau. Specifically, it included: 

1. An automated address control file. 

2. Automated check-in of questionnaires using bar code technology. 

3. Filming all completed questionnaires at the check-in office. 

4. Shipping all film to FACT 90 locations for conversion to 
computer readable files and further computer processing. 






5. A computer communication network linking the FACT 90 computer 
centers with the regional processing/field offices. The latter 
offices need sufficient computing power to support a CATI 
operation, as well as other data processing needs. 

6. Automated editing of completed questionnaires at the FACT 90 
computer centers and automated generation of a file of fail-edit 
cases for follow-up, either by CATI or by personal visit, with 
the household in question. The file should include all accepted 
original responses, but with the fail-edit items left blank. 

7. Automated transfer of the fail-edit cases via the computer 
communication network to the regional processing/field offices. 

8. Use of CATI in the regional processing offices to follow-up 
fail-edit cases that have telephones. Specifically, this 
implies an ability by the CATI interviewers to access, in turn, 
each of the unresolved fail-edit cases stored in the regional 
processing office computer. 

9. Automated generation of questionnaires containing the reported 
data in field offices for field follow-up of fail-edit cases 
requiring a personal interview. 

10. Automated transfer of the CATI resolved fail-edit cases via the 
computer communication network from the regional processing 
office to a FACT 90 computer center. Fail-edit cases resolved 
by personal interview, using a hard copy of the questionnaire, 
would be processed in the field office in the same manner as 
original hard copies. 

The technology for the level of automation implied by the above 
exists. Subsets of the 10 components have been used in the past, either in 
prior censuses by the Bureau or in sample surveys by private sector survey 
research organizations. The major advantages to this level of automation 
are increased accuracy, timeliness and productivity, in my opinion. The 
major stumbling blocks are cost for the computer equipment, the ability to 
write and test the required software, and a willingness by the Bureau to 
commit to putting it in place for the 1990 Census no later than September 
30, 1986. 
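
Items 6 through 9 of the list above amount to a small routing rule, which the following Python sketch tries to make concrete. It is an illustration only, not the Bureau's edit specification: the item names in `REQUIRED_ITEMS`, the age range check, and the function names are all hypothetical. The sketch keeps the accepted original responses, blanks the items that fail the edit, and sends the case to CATI when the household has a telephone and to personal visit otherwise.

```python
REQUIRED_ITEMS = ("persons", "tenure", "age_of_head")  # hypothetical item names

def edit_questionnaire(responses):
    """Return the list of items that are missing or inadmissible."""
    failed = []
    for item in REQUIRED_ITEMS:
        value = responses.get(item)
        if value is None or value == "":
            failed.append(item)  # missing item
    age = responses.get("age_of_head")
    if isinstance(age, int) and not 0 <= age <= 120:
        failed.append("age_of_head")  # inadmissible value (assumed range)
    return failed

def follow_up_case(case_id, responses, has_telephone):
    """Build a follow-up record, or return None when the edit is passed."""
    failed = edit_questionnaire(responses)
    if not failed:
        return None
    # Keep accepted original responses; leave fail-edit items blank.
    form = {k: (None if k in failed else v) for k, v in responses.items()}
    for item in failed:
        form.setdefault(item, None)
    mode = "CATI" if has_telephone else "personal visit"
    return {"case_id": case_id, "mode": mode,
            "form": form, "blank_items": failed}
```

The `form` entry is what would be shown on the CATI screen or printed as a new questionnaire for the enumerator.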

The Bureau staff may feel the CATI portion will not be workable 
without having to key names of heads of households at check-in in order to 
be able to subsequently match follow-up cases with the correct household 
during the follow-up interview. The same potential matching problem 
exists for follow-up of fail-edit cases requiring a personal interview. 
This is not a problem for either the CATI follow-up or the personal 
interview follow-up cases, in my opinion. As outlined above, in items 6 
and 9, the CATI interviewer, and the personal visit follow-up interviewer, 
will have all the originally reported data for use in verifying that a 
particular family unit is the same one that sent in the original 
questionnaire. 

If the Bureau still has concerns about the ability to match fail-edit 
follow-up cases in an automated environment (i.e. without reference to the 
original hard copy), then I suggest the original questionnaire be 
structured so that the first initial and first four letters of the surname 





of the head of household can be coded by the household respondent for 
subsequent FOSDIC conversion to computer readable form. This additional 
information should be sufficient for linking CATI and personal interview 
follow-up cases to the correct original household without reference to the 
original hard copy questionnaire. I doubt that the Bureau has even 
considered requesting householders to special code surnames for subsequent 
FOSDIC conversion to computer readable form, yet current 1990 Census plans 
call for a significant amount of manual name keying. 
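
The linking key suggested above (first initial plus the first four letters of the surname) can be illustrated with a short Python sketch. The record layout, the use of an address identifier as the second part of the key, and the function names are assumptions made for the example; the answer itself specifies only the name code.

```python
def name_code(first_name, surname):
    """First initial plus first four letters of the surname, upper-cased.

    Non-letters (apostrophes, hyphens, spaces) in the surname are skipped,
    an assumption about how a respondent would mark the coding grid.
    """
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    initial = first_name.strip().upper()[:1]
    return initial + "".join(letters[:4])

def link_follow_up(follow_up, census_records):
    """Return census records whose address ID and name code both match."""
    key = (follow_up["address_id"],
           name_code(follow_up["first"], follow_up["surname"]))
    return [r for r in census_records
            if (r["address_id"], r["name_code"]) == key]
```

For example, `name_code("Jane", "Smith")` gives `"JSMIT"`. Other reported items could be added to the key in the same way if a five-character code proved too coarse.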

I do not have access to the essential parameter values to know just 
how expensive using CATI for follow-up cases might be. I suspect the 
major additional cost item will be the computer hardware. There are cost 
savings since the follow-up data will not have to be keyed. There will 
also be savings associated with a shorter overall time period to complete 
the data collection. Since the hardware will only be needed for a short 
time, the Congress should authorize a separate hardware budget that would 
permit the Bureau to lease the hardware it requires. 

Certainly, CATI has not been used for any operation as large as the 
decennial census. It has, however, been used in very sizeable operations, 
operations which have been as large as might be required for telephone 
follow-up of fail-edit cases by a census regional processing office. 
There is a tendency to discount the potential of census alternatives which 
have proven effective in sample survey operations by pointing out that the 
decennial census is very large. This is not a valid argument when, in 
fact, the same procedures are merely being duplicated across a large 
number of regional or district offices. The important question to ask is 
whether the technique or procedure is feasible and cost effective for a 
regional/district office. If it can be implemented for one office and 
work well, then it should be relatively easy to duplicate across the total 
set of census offices. For this reason, I don't consider the size of the 
census to be relevant to whether CATI or the other automated procedures 
listed above can be implemented or not in the 1990 Census. 

I am convinced that the level of automation listed above is practical 
for the census, considering that only tested technology, including CATI, 
is suggested. The savings in manpower costs incurred with manual 
procedures may not be a sufficient tradeoff against the hardware costs 
incurred with the proposed automated procedures, however. It should be 
noted that hardware costs have been coming down fairly rapidly of late. 
It should also be recognized that automation could yield a significant 
increase in the quality of the census, so that any evaluation of the 
practicality of a particular automation alternative should be in terms of 
its cost effectiveness relative to manual procedures. 

Finally, it is important, in evaluating automation alternatives, to 
compare total systems, in my opinion. It was not very evident from the 
background papers prepared by Bureau staff that total systems were being 
compared. I have a distinct impression that system components were 
examined piecemeal, and that they were evaluated only at the margin 
relative to 1980 Census procedures. 



Question 2. How important is concurrent processing to the success of the 
1990 Census? What will happen if the Bureau is not able to use this new 
system? 






In my opinion, concurrent processing makes a great deal of sense. It 
can improve the timeliness of census reports considerably and at the same 
time reduce the cost of the census. I do not see how we can accept a 
lesser standard than that obtainable with concurrent processing. If the 
Bureau is not able to use concurrent processing, I can only conclude that 
the 1990 Census will be conducted no differently than the 1980 Census, 
and, as a consequence, with either less quality or, at best, with no 
change in quality. 



Question 3. The background papers we received from the Census Bureau 
suggest that the Bureau might have to key the names of many people into 
its computers if it wants to complete the census on time and also do a 
full evaluation of its coverage. 

Do you think that this can or should be accomplished? 

— Why is this needed? 

— How would it affect the privacy of people who answer census 
questionnaires? 

It is my understanding that the Bureau will key names to assist in 
subsequent linking of households in the post enumeration survey with 
households in the census. With more accurate matching, a better measure 
of the quality of the census will be obtained. There may also be other 
reasons associated with matching or linking of household records. As 
suggested above, if the Bureau considers keying names to be necessary then 
I don't understand why the Bureau has not come up with a better name 
capture technology, just as it came up with the FOSDIC technology to 
capture the census data from the census questionnaires some years ago. I 
consider the development of a system whereby census respondents would use 
a FOSDIC readable coding scheme on the questionnaire to enter the initials 
and four or five letters of the surname of the head of the household to be 
both feasible and practical. Thus, there would be no need to key the name 
data, since it would be read by FOSDIC equipment along with the factual 
data on the questionnaire. If name data for all persons in the household 
are required, then the FOSDIC readable coding scheme should be used for 
all. It should be noted that complete names need not be coded on the 
census questionnaire since the other census data known for a person can be 
treated as part of the signature of that person for matching purposes. 
My preference is for a scheme such as this for name capture rather than 
keying the names. 

I doubt that keying the names would affect the privacy of census 
respondents. The Bureau has always put more than adequate procedures in 
place to protect privacy and confidentiality. I am confident the Bureau 
would do no less in this situation. 







Mr. Ackerman. We will now hear from Prof. William Eddy, De- 
partment of Statistics, Carnegie Mellon University. 

STATEMENT OF WILLIAM EDDY, DEPARTMENT OF STATISTICS, 
CARNEGIE MELLON UNIVERSITY 

Mr. Eddy. Mr. Chairman, I am pleased to appear here today to 
comment on the preparatory work of the Census Bureau for the 
1990 census. The remarks that I am going to make are excerpted 
from a written statement that I have submitted and I would like to 
have inserted in the record. 

My comments are going to be divided into four broad parts: one 
part related to the automated address file, one part related to the 
automated check-in, a third part related to concurrent processing, 
and a final part which relates more generally to the use of comput- 
ers and automated technology in the conduct of the census in the 
year 2000. It is not too soon to begin to think of that. 

Although my comments are directed at technological issues, it is 
generally my impression that the most serious problems faced by 
the Census Bureau are managerial. The Census Bureau has demon- 
strated in the past its ability to solve these managerial problems, 
and therefore I think we need to challenge them to address these 
technological problems more directly. The potential benefits from 
the use of advanced technology in reduced costs, more accurate re- 
sults and more timely results can be very large. 

First with respect to the automated address file, every form dis- 
tributed by the Census Bureau for the 1990 census will have a bar 
code label attached to it. Presumably the bar code labels will be 
generated by computer from the address control file. Thus, when a 
form returns, the bar code can be read automatically and the ad- 
dress control file can be updated to reflect the receipt of the form. 
This has such obvious benefits over the previous 1980 census proce- 
dures that the Census Bureau should be commended for its use. It 
will be much easier to maintain an up-to-date address list. When 
Bureau personnel learn of errors in the existing list, they can 
make changes quite quickly, easily, uniformly, permanently. This 
is just a vast improvement over the previous procedures which in- 
volved handwritten modifications to a printed listing. 

On the other hand, there is an implied capability to communi- 
cate changes made to this file to other levels of the Census Bureau 
hierarchy and, I presume, to a central copy of this address control 
file. I am unable to discover that the Census Bureau has any plans 
to develop an automatic capability to allow this transfer of infor- 
mation. 

With respect to the automated check-in, some of the documents 
that I was sent for review contained the following statement: 
"Would personal computers used in wanding" (that refers to reading 
the bar code) "have more use after 1990 than laser sorters?" These 
are two alternative technologies. I think it is clear that personal 
computers are more likely to be useful than specialized laser sorting 
machines. In the 1986 test census the personal computers I 
think the question refers to were IBM PC/XT machines. The IBM 
PC technology was introduced in 1981 and by 1991 when these per- 
sonal computers become available for other uses within the Census 






Bureau they will be 10 years old. As is a problem with a lot of com- 
puter equipment, I think about amortizing my hardware over a 
much shorter period of time, say 4 years, because of the rapid pace 
of technology and, in fact, the IRS only requires that it be amortized 
over 5 years. The notion that IBM PCs can be used for any 
useful purpose by the Census Bureau in 1991 is absolutely ludicrous. 

With respect to concurrent processing, I am surprised to discover 
that the Census Bureau has not done this processing in the past. 
From a management point of view it is obvious why they have not. 
On the other hand, the need for quality control seems to me to re- 
quire processing of forms at essentially the same time as the forms 
are checked in, and I am very pleased to see the Bureau is moving 
in this direction and I would like to encourage them to continue 
more rapidly in that direction. 

I would now like to turn to the future census, the 21st Century. 
We all know that the decennial census is the single most important 
statistical activity of the Federal Government, and our Census 
Bureau has a long and honored tradition of doing the best possible 
job for the lowest possible cost. Along the way the Census Bureau 
has been a harbinger of computer technology for the Federal Government 
and for the nation as a whole. However, based on my understanding 
of the planning for the 1990 census, this leadership 
role is changing. I fear that we may end up using 1970 technology 
during the census of the year 2000. I would, therefore, like to make 
a few suggestions which bear further exploration in an effort to 
move the census into the technological future. Of course, by the 
year 2000 it will be the present or even the past. 

With respect to data collection, a review of the Census Bureau's 
planning documents that I have seen indicates no attempt to col- 
lect the raw data by the use of computer technology. Dr. Horvitz 
referred to a currently existing technology. I am thinking of some- 
thing even more advanced. It is easy to argue that it is impossible 
to use computer technology in 100 million homes in the United 
States. I would challenge the Bureau to rethink this question. 

The United States has the world's largest and most widely avail- 
able data communications network, the telephone system. A large 
savings in manpower could be achieved if each respondent were to 
telephone the collection office and enter their responses directly 
into Census Bureau computers. The technology exists today to do 
this. Such a procedure does not even require that a respondent 
have a telephone. Merely access to a telephone. A further advan- 
tage, of course, is that respondents can initiate the call themselves 
rather than having some impersonal Bureau computer call them, 
and I presume that would improve the quality of the data. 

With respect to data transfer, I was unable to find any hint that 
the Census Bureau has a plan to use modern technology for data 
transfer among its various offices such as computer communications 
networks and, in fact, in a specific document referred to in 
my written testimony I found a flow chart describing the Census 
Bureau's plans for what they call basic urban processing, and in 
that document the symbol they used to indicate the transfer of 
data appears to me to be a railroad ore car of the type that was 
used in mines in this country in the last century. 







Computer communications networks are ubiquitous in the 
United States and have been for a decade or more. They range 
from small local area networks which are typically privately owned 
and cover small geographic areas to wide area networks which are 
typically leased from the telephone company — I guess I should say 
companies — and are used for long haul data transmission over 
thousands of miles. 

Two specific examples that were both developed and supported 
by Federal Government agencies are the Advanced Research 
Projects Agency [ARPA] net and the National Science Foundation 
[NSF] net. In a perfect system the Census Bureau microfilm system, 
the film optical sensing device for input to computers or FOSDIC, 
would be irrelevant; the data would be entered directly into a com- 
puter and could be stored on magnetic media or on laser disks. 
FOSDIC is one more example of the Bureau's predisposition toward 
tried and true technology. FOSDIC was originally developed in the 
mid 1950's for the 1960 census. In 1990 it is going to be refurbished 
and called FACT90. It will be 35 years old in 1990. In the year 2000 
FOSDIC will be 45 years old. Surely the Census Bureau should lay 
plans to move from its paper-based methods and its paper-processing 
techniques to something a little more modern. Sadly, I find no 
evidence to suggest this is true. 

In particular, I believe that the Census Bureau must take a seri- 
ous look at optical character recognition technology for use in 
future censuses. It is certainly too late to incorporate this technolo- 
gy in the 1990 census, but it is not too early to begin thinking and 
experimenting with its use for the year 2000. 

The USPS has, I believe, extensive experience with OCR technol- 
ogy in its ZIP plus 4 program. 

I would like to make the following final remark. The vast majori- 
ty of the most serious problems faced by Census planners are the 
management of the estimated 100 million forms and the manage- 
ment of the people that are going to be needed to process them. 
Since the goal of the census is to determine or, to use a statistical 
term, to estimate the population of the United States as accurately 
as possible, it seems obvious that improvement to the management 
problem can be obtained by the use of additional approaches; that 
is, in addition to enumeration. The most obvious approach is to use 
a carefully designed sample together with sophisticated statistical 
techniques in addition to enumeration. 

Since April 1980 there has been a lively discussion in academic, 
legal and political circles concerning the undercount revealed by 
the postcensus enumeration programs of the 1980 census. Much of 
this discussion has focused on whether we should enumerate the 
population or estimate it. I believe that estimation together with 
enumeration can be a highly cost effective solution to the manage- 
ment problems faced by the Census Bureau in 1990. 

Thank you, Mr. Chairman. 

Mr. Ackerman. Thank you very much for your testimony, Dr. 
Eddy, especially for a very graphic analogy. 

[The statement of William F. Eddy follows; also included are his 
responses to written questions:] 







May 1, 1986 

STATEMENT 
OF 

WILLIAM F. EDDY 
PROFESSOR OF STATISTICS AT CARNEGIE-MELLON UNIVERSITY 
TO THE 

SUBCOMMITTEE ON CENSUS AND POPULATION 
OF THE 

COMMITTEE ON POST OFFICE AND CIVIL SERVICE 
U.S. HOUSE OF REPRESENTATIVES 

Mr. Chairman, I am pleased that the Chairman and the Subcommittee maintain a 
continued interest in the preparatory work of the Census Bureau for the 1990 
Decennial Census and I am pleased to have the opportunity to appear before you 
today to comment on that preparatory work. My comments are divided into four 
broad parts: the first part relates specifically to the use of an automated address 
file; the second part relates specifically to the automated check-in during the conduct 
of the 1990 Census; the third part relates specifically to the use of concurrent 
processing during the 1990 Census; and the fourth part relates more generally to the 
use of computers and advanced technology in the conduct of the Census in the year 
2000. 

I would like to make the following preliminary remark. Although my comments are 
directed at technological issues it is generally my impression that the most serious 
problems faced by Census planners for 1990 are managerial. Because the Census 
Bureau has demonstrated its ability to solve the managerial problems (in previous 
censuses) I think we need to challenge them to address the technological problems 
more directly. The potential benefits in reduced costs and more accurate results (and 
in less time) could be large. 

1. Automated Address File 

Every form distributed by the Census Bureau for the 1990 Census will have a bar 
code label attached to it; presumably, the bar code labels will be generated by 
computer from the Address Control File. Thus, when a form is returned the bar code 
can be read by automated equipment and the Address Control File can be updated to 
reflect the receipt of the form. This procedure has a number of obvious benefits 
over previous check-in techniques. 
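
The check-in procedure just described can be illustrated with a minimal sketch. It assumes a hypothetical in-memory Address Control File keyed by bar code; the class names, field names, and behavior on a duplicate return are illustrative assumptions and not the Bureau's actual design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class AddressRecord:
    """One entry in a hypothetical Address Control File."""
    barcode: str
    address: str
    received_on: Optional[date] = None  # None until the form is checked in


class AddressControlFile:
    """Illustrative in-memory file keyed by bar code (an assumption)."""

    def __init__(self) -> None:
        self._records: Dict[str, AddressRecord] = {}

    def add(self, barcode: str, address: str) -> None:
        self._records[barcode] = AddressRecord(barcode, address)

    def check_in(self, barcode: str, when: date) -> bool:
        """Record receipt of a form; return False if already checked in."""
        record = self._records[barcode]  # KeyError flags an unknown bar code
        if record.received_on is not None:
            return False
        record.received_on = when
        return True

    def outstanding(self) -> List[str]:
        """Bar codes of forms that have not yet been returned."""
        return sorted(b for b, r in self._records.items()
                      if r.received_on is None)
```

Reading a bar code then amounts to one call to `check_in`, and the list of addresses needing follow-up is whatever `outstanding` returns at that moment.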

First, it will be easier for the Census Bureau to maintain an up-to-date Address 
Control File; that is, when Bureau personnel learn of errors in the existing file they 
can make changes quickly, easily, uniformly, and permanently. This is a vast 
improvement over previous procedures, which involved handwritten modifications to a 
printed list. On the other hand there is an implied capability to communicate the 
changes made to the local copy of the Address Control File to other levels of the 
Census Bureau hierarchy and presumably to the central copy of the file; I do not 
believe that this capability currently exists or is contemplated. 

The Address Control File is developed in cooperation with the Postal Service 
(USPS). I presume that the USPS has the ability to give a fairly precise list of the 
residents of any particular address and I am surprised that Census Bureau plans do 
not include the acquisition of this information. The potential savings in later keying 
of names are vast. 

Second, check-in can be handled by automated equipment capable of reading the bar 
code label attached to the form, greatly reducing the clerical burden. The Bureau 
already has plans to use multi-pocket laser sorters and/or laser hand wands of the 
type that exist in many major retail stores today. I believe that the use of hand 
wands should be limited to forms that cannot be processed by the automated 
equipment but the Census Bureau plans are a step in the right direction. 

2. Automated Check-in 

One of the documents sent to me for review contained the following (I can only 
hope, rhetorical) question: 

• Would personal computers used in wanding have more use after 1990 than 
laser sorters? 

I think it is clear that personal computers are more likely to be useful after the 
Census than are laser sorters. In the 1986 Test Census the personal computers 
referred to in the question are IBM PC/XT machines. I feel compelled to point out 
that the IBM PC technology was introduced in 1981 and by 1991, when the personal 
computers are available for other use, will be ten years old. Most serious owners 
of computers amortize their hardware over a period of four years and even the 
Internal Revenue Service only requires it be amortized over five years. The notion 
that an IBM PC/XT will have any value to the Census Bureau in 1991 is ludicrous. 
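
The amortization argument can be made concrete with a small sketch. It assumes straight-line amortization (the statement does not name a method), a hypothetical purchase price, and a machine bought for the 1986 Test Census and released in 1991.

```python
def book_value(cost: float, age_years: float, life_years: float) -> float:
    """Straight-line book value: cost written off evenly over life_years."""
    if life_years <= 0:
        raise ValueError("life_years must be positive")
    remaining_fraction = max(0.0, 1.0 - age_years / life_years)
    return cost * remaining_fraction


# Hypothetical purchase price; the statement gives no dollar figure.
PRICE = 3000.0

for life in (4, 5):
    # A machine bought for the 1986 test census is 5 years old in 1991.
    print(life, book_value(PRICE, age_years=5, life_years=life))
```

Under either the four-year or the five-year schedule the machine is fully written off by 1991, which is the point of the paragraph above.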

The Census Bureau has clearly decided to separate the processing associated with 
check-in from the processing associated with the gathering of information from the 
forms. From a management point of view this is clearly desirable. This is the stage 
where the confidentiality of the information on the forms is most weakly protected; 







the actual check-in procedure is much simpler than the data-entry step. On the other 
hand, it appears that there will be some work which is repeated in the two stages. In 
particular, the keying of names appears to be duplicated. 

3. Concurrent Processing 

I am surprised to discover that, in the past, the Census Bureau has done no actual 
processing of the returned census forms until the check-in was complete. Again, 
from a management point of view this is the easy way. On the other hand the need 
for quality control seems to me to require processing of the forms at essentially the 
same time as the check-in. I am very pleased to see that the Bureau is moving 
towards concurrent processing for the 1990 Census and I can only encourage more 
of the same. 

I find it extremely difficult to understand why the Census Bureau finds it necessary 
to sort the paper forms. Surely it is more efficient to enter the data from the 
forms into a computer in the order the forms are received and do the sorting with a 
computer. If it is essential to be able to recover a particular paper form, its 
position in the arrival order could be attached as a data item to the computer record 
for a particular form. I believe that the desire to sort the forms is just a 
bureaucratic habit and should be dispensed with; I challenge the Census Bureau to 
show that it is cost-effective to sort the forms. 
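
A minimal sketch of the alternative proposed here: key the forms in arrival order, let the computer do the sorting, and keep the arrival position so a paper form can still be found. The record layout and the (district, block) key are hypothetical, chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class FormRecord(NamedTuple):
    arrival_seq: int           # position of the paper form in the arrival stack
    geo_key: Tuple[int, int]   # hypothetical (district, block) identifier
    answers: str               # stand-in for the keyed questionnaire data


def enter_in_arrival_order(
        forms: List[Tuple[Tuple[int, int], str]]) -> List[FormRecord]:
    """Key forms as they arrive, tagging each with its arrival position."""
    return [FormRecord(seq, geo, answers)
            for seq, (geo, answers) in enumerate(forms, start=1)]


def sort_by_geography(records: List[FormRecord]) -> List[FormRecord]:
    """The computer, not a clerk, puts the records into geographic order."""
    return sorted(records, key=lambda r: r.geo_key)


def paper_positions(records: List[FormRecord],
                    geo_key: Tuple[int, int]) -> List[int]:
    """Where in the unsorted paper stack to find the forms for one block."""
    return [r.arrival_seq for r in records if r.geo_key == geo_key]
```

The paper forms never move after they are stacked; only the computer records are reordered, and `paper_positions` gives the stack positions needed to pull a particular form.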

4. Computers and the Census in the 21st Century 

I know that we all agree that the Decennial Census is the single most important 
statistical activity of the Federal government and that our Census Bureau has a long 
and honored tradition of doing the best possible job for the lowest possible cost. 
Along the way the Census Bureau has been a harbinger of computer technology for 
the Federal government and the nation as a whole. Based on my understanding of 
the planning for the 1990 Census, this leadership role is changing. I fear that we 
may end up using 1970 technology in the Census for the year 2000. I would like to 
suggest a few ideas which bear further exploration in an effort to move the Census 
into the technological future (which will be the present or even the past by the year 
2000). 

The purpose of using computer technology in the Census is to speed up all the 
activities; as I am sure you know, even with the introduction of the Hollerith punched 
card machines, the 1890 Census still took nearly a decade to be completed. I am 
able to identify several major data processing activities (other than the actual 






manipulation of data that is traditionally called data processing) associated with the 
Census which can and should rely heavily on modern computer technology; those are: 

1. data collection; 

2. data transfer; 

3. data storage and retrieval. 

4.1. Data Collection 

A review of the Census Bureau planning documents made available to me indicates 
absolutely no attempt to collect the raw data by use of computer technology. While 
it is easy to argue that it is impossible to make use of computer technology in the 
one hundred million homes that must be enumerated, I would challenge the Bureau to 
reconsider this matter for the years 2000 and beyond. 

The United States has the world's largest and most widely available data 
communications network, the telephone system. A gigantic savings in manpower 
could be achieved if each respondent were to telephone the collection office and 
enter their responses directly into Census Bureau computers. The technology exists 
today: auto-answer modems, push-button telephones, pre-recorded messages, 
computer-controlled dialogs, etc. Such a procedure does not even require that a 
respondent have a telephone, only access to a telephone. A further advantage, from 
my point of view, to having respondents initiate the call is that the impersonal 
nature of human-computer interaction is reduced somewhat; of course, experiments 
with such a procedure might show that quality of the data is enhanced by having the 
Bureau initiate the calls with autodial equipment. 

In what appears to me to be a very closely related matter, the government of 
France (which, I believe, owns the French telephone company, PTT) has had a plan to 
give away visual display terminals to every telephone subscriber. The improved 
accuracy of Census respondents who can visualize questions and answers to 
telephone queries seems obvious. This whole idea of having respondents respond by 
means other than pencil and paper requires that the Census Bureau relinquish the 
notion of having everything on paper or microfilm. Banks, which are unarguably at 
least as concerned with privacy and preservation of information as the Census 
Bureau, have been using computer technology for data entry and cash withdrawal for 
many years. Of course, the transition to a "cashless" society is taking much longer 
than its proponents thought; on the other hand, I, personally, have not entered a bank 
in the last several years. 







4.2. Data Transfer 

In my review of Census Bureau planning documents I was unable to find any hint 
of a plan to use modern technology for data transfer among the various offices. In 
the Processing Manual for the 1986 Test Census (Volume IV, Chapter 1, Attachment 
A) even the Census Bureau itself has recognized how outdated its methods are; in the 
flow-chart describing the Basic Urban Processing, the symbol used to indicate the 
transfer of data appears to be a railroad ore car of the type used in mines in the last 
century. 

I believe that the Census Bureau is planning to use magnetic tape recorded at 6250 
bpi as its information transfer technology. This basic technology has existed for 
thirty or more years (with a density increase factor of twenty over that time period). 
This has the obvious benefits of being extremely reliable and familiar and the 
obvious drawback of being extremely slow; data transfer from coast to coast takes 
days. 

Computer communication networks are ubiquitous in the United States and have 
been for a decade or more. These range from local area networks which are 
typically privately owned and cover a small geographic area (a few square miles) to 
wide area networks which typically are leased from the phone company and are used 
for long-haul data transmission (over thousands of miles). I would like to mention 
two specific examples, both developed and supported by Federal government 
agencies. 

About two decades ago the Advanced Research Projects Agency in the Department 
of Defense started development of a private packet-switching network for 
communication among its contractors and itself. Over time this network has 
developed into a world-wide data communications system with transmission times 
that are typically measured in hours. The network is composed of a large number of 
fairly short interconnected links typically operating at speeds of 56 Kb/second and is 
used for the transmission of large numbers of relatively small pieces of data. The 
stability of the ARPAnet as a communications system is legendary. The system is 
composed of redundant paths and sophisticated hardware and software; I believe 
there have only been two system-wide outages ever and the down-time during each 
outage was measured in hours. From the point of view of its users the ARPAnet is 
a tremendous success. 

The National Science Foundation, as a part of its development of national 
supercomputer capability, is creating a high bandwidth packet-switching network to 
join its five supercomputer centers scattered across the country and to the large 
number of researchers who will need access to those centers. The network will be 
operational later this year. The important point to mention here is that there are a 
fairly small number of individual point-to-point links, connecting centers hundreds or 
thousands of miles apart, each capable of transmitting data at rates up to 1.5 
Mb/second. According to a crude calculation, the entire Census data collection could 
be transmitted over a single link of this kind in a matter of months. 
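
A crude calculation of this kind can be sketched as follows. The statement does not give the data volume behind its estimate, so the bytes-per-form values below are assumptions chosen only to show how the arithmetic works; protocol overhead and retransmission are ignored.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to push total_bytes through one link, ignoring overhead."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return total_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


FORMS = 100_000_000          # "one hundred million" forms, from the statement
NSF_LINK = 1_500_000         # 1.5 Mb/second
ARPA_LINK = 56_000           # 56 Kb/second

# Bytes per form is NOT given in the statement; these values are assumptions.
for bytes_per_form in (1_000, 10_000):
    total = FORMS * bytes_per_form
    print(bytes_per_form,
          round(transfer_days(total, NSF_LINK), 1),
          round(transfer_days(total, ARPA_LINK), 1))
```

The result scales linearly with the assumed volume per form, so the choice of that figure dominates the estimate.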

4.3. Data Storage and Retrieval 

In a perfect system, the Census Bureau microfilm system (i.e., the Film Optical 
Sensing Device for Input to Computers or FOSDIC) would be irrelevant; the data 
would be entered directly into a computer and could be stored on magnetic media or 
on laser disks. FOSDIC is one more example of the Bureau's predisposition toward 
"tried-and-true" technology. FOSDIC was originally developed in the mid 1950's for 
the 1960 census. In 1990 it will be refurbished and called FACT90 and will be thirty- 
five years old. In 2000, FOSDIC will be FORTY-FIVE YEARS OLD! Surely the Census 
Bureau must be laying plans to move from its paper-based methods and its paper- 
processing techniques to something a little more modern. Sadly, I find no evidence 
to suggest this is true. 

It is a truism that personnel costs are increasing and equipment costs are 
decreasing. In slightly different terms, the cost of services is going up and the cost 
of goods is going down. If this is true, then any organization which engages in a 
people-intensive activity should be searching hard for ways to substitute goods for 
services. Computers and the associated technologies are an obvious place to begin. 
I believe that the Census Bureau must take a serious look at Optical Character 
Recognition (OCR) technology for use in future Censuses. It is certainly too late to 
incorporate this technology in the 1990 Census but it is not too early to begin 
thinking and experimenting with its use for the year 2000. The USPS has, I believe, 
extensive experience with OCR technology in its ZIP+4 (the nine digit zipcode) 
program. 

In 1983, when I was consultant to a USPS vendor, OCR technology was already 
being used to assign nine digit zipcodes to individual pieces of mail and mark them 
with bar codes using inkjets. I believe that by now the USPS has purchased 
hundreds (or, perhaps, thousands) of machines which are capable of reading addresses 
and assigning nine digit zipcodes to the addresses for more than 80 percent of the 
mail pieces with less than two percent error; all this happens at rates like 25,000 
pieces per hour per machine. My recollection is that these rates apply to 
"collection" mail (that is, the stuff they find in mailboxes) and that higher rates apply 
to various kinds of mass and machine-generated mailings. The remaining twenty 
percent that cannot be identified by the OCR equipment is set aside for human 
processing. As I recall the most difficult cases were addresses written with green 
felt-tip pen on green envelopes. 
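
To give a sense of scale, the postal figures quoted above can be applied to the estimated one hundred million census forms. This is only a sketch: it assumes that reading rates for mail addresses would carry over to census questionnaires, which the statement does not claim, and the number of machines is a hypothetical value.

```python
def ocr_workload(pieces: int, pieces_per_hour: int, read_fraction: float,
                 machines: int) -> dict:
    """Split a workload between OCR machines and human processing."""
    if machines <= 0 or pieces_per_hour <= 0:
        raise ValueError("machines and pieces_per_hour must be positive")
    machine_hours = pieces / pieces_per_hour
    read_by_machine = round(pieces * read_fraction)
    return {
        "machine_hours": machine_hours,
        "elapsed_hours": machine_hours / machines,
        "read_by_machine": read_by_machine,
        "set_aside_for_humans": pieces - read_by_machine,
    }


# 100 million forms, 25,000 pieces/hour, 80 percent read: figures from the text.
# The machine count of 100 is a hypothetical value chosen for illustration.
result = ocr_workload(100_000_000, 25_000, 0.80, machines=100)
print(result)
```

At these rates the machine time is small; the twenty percent set aside for human processing is where most of the remaining effort would lie.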

I would like to make the following final remark. The vast majority of the most 
serious problems faced by Census planners are the management of the estimated one 
hundred million forms to be processed (and the management of the people needed to 
process them). Since the goal of the Census is to determine (or, if I may use a 
statistical word, estimate) the population of the United States as accurately as 
possible, it seems obvious to me that an improvement to the management problem 
can be gained by another approach in addition to enumeration. The obvious approach 
is to use a carefully designed sample together with sophisticated statistical 
estimation techniques in addition to enumeration. Since April 1980, there has been a 
lively discussion in academic, legal and political circles concerning the "undercount" 
revealed by the various post-census enumeration programs of the 1980 Census. Much 
of this discussion has focussed on whether we should enumerate the population or 
estimate it. I believe that estimation together with enumeration can be a highly 
cost-effective solution to the management problems faced by the Census Bureau in 
1990 and beyond. 







RESPONSES BY WILLIAM F. EDDY TO 
QUESTIONS POSED BY CHAIRMAN GARCIA FOR 
THE RECORD OF THE HEARING OF MAY 1, 1986 



1. Question: You have presented a very forward looking piece of testimony looking 
ahead to the year 2000. Considering how important advance planning is, that is 
certainly a laudable approach. But I wonder, how much of what you described is 
currently possible? What changes do you think that the Census Bureau could adopt 
for the 1990 Census that would implement your ideas with currently available 
technology? 

Answer: The short answer is that all of my suggestions are currently possible; 
the technology exists today. Unfortunately, the management problems implied by 
attempts to implement the technology are myriad; and given the Census Bureau 
predisposition to resist changing a "working" system it may be impossible to 
implement any of my suggestions for 1990. 

The long answer is that two of my specific suggestions realistically can and should 
be implemented for the 1990 Census: 

1. the use of local computers rather than humans to sort questionnaires; and 

2. the use of a communications network to transmit data to central computers. 

The Processing Manual for the 1986 Test Census, Volume IV, Chapter 1 entitled 
"Urban Processing Overview" clearly indicates that the Census Bureau intends to sort 
the returned questionnaires into CO/CBNA,Block order following processing. There is 
absolutely no reason to perform this hand sorting. Since each questionnaire will have 
a unique laser-readable barcode attached to it, later searching, matching operations, 
or evaluations can easily recover a particular questionnaire from storage without prior 
sorting. This can be accomplished by numbering the returned questionnaires in order 
of arrival; while this numbering does not actually have to be marked on the returned 
questionnaires it probably should be. The numbering could be performed 







automatically by the multi-pocket laser reader/sorters by the addition to them of 
inkjet barcode writers. Having the number written as a barcode means that it could 
be subsequently read by the same laser sorter, if needed. If both the original 
barcode (which determines the physical location of the respondent) and the additional 
barcode (which determines the order of arrival and hence the ultimate storage 
location) are entered into a computer (which, again, could be done 
automatically), then it will be possible to recover any particular questionnaire by 
merely performing a computer sort and select operation. 

In my written statement I described, in some detail, two existing computer 
communication networks which were developed by Federal government agencies. I 
believe that the Census Bureau could achieve considerable speed-up in data 
transmission time with identical accuracy by the development and use of its own 
computer communication network. Given the organizational plans (a number of 
processing offices and a smaller number of district offices), it probably makes sense 
for the Bureau to plan on a hierarchical network corresponding precisely to the 
organizational structure; this is actually further justified by the data flow, which, for 
the questionnaires, will flow up the hierarchy; the point is that there will be little, if 
any, data flow between processing offices or between district offices. The largest 
number of communication links (those joining the processing offices to the district 
offices) do not need to have a large bandwidth and hence will have lower cost; the 
smallest number of links (those joining the district offices to Census Bureau central 
computer facilities) need the largest bandwidth but the smaller number of them will 
keep the cost down. 
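
The hierarchical network described in this answer can be sketched as a simple tree in which data flows only upward. The office names, the number of offices, and the per-office data rate are hypothetical; the sketch shows only why the few district-to-central links carry the most traffic.

```python
from typing import Dict, List

CENTRAL = "central"


def build_parents(districts: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every office to the node one level up the hierarchy."""
    parents: Dict[str, str] = {}
    for district, processing_offices in districts.items():
        parents[district] = CENTRAL
        for office in processing_offices:
            parents[office] = district
    return parents


def route_to_central(office: str, parents: Dict[str, str]) -> List[str]:
    """Follow the links upward from an office to the central computers."""
    path = [office]
    while path[-1] != CENTRAL:
        path.append(parents[path[-1]])
    return path


def district_uplink_load(districts: Dict[str, List[str]],
                         office_rate: float) -> Dict[str, float]:
    """Traffic each district-to-central link must carry if every
    processing office sends office_rate units of data upward."""
    return {d: office_rate * len(offices) for d, offices in districts.items()}
```

Because there are no links between offices at the same level, each office needs exactly one link, and the bandwidth of a district link is just the sum of what its processing offices send.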

2. Question: In your testimony you criticized the Census Bureau's plans for the use of 
its equipment after the census. What do you think that the Census Bureau should do 
with its equipment after the census is completed? In an operation that will only last 
a few days, how can you justify spending money on expensive equipment that will 
not be of use after the operation is over? 







Answer: The Census Bureau plans to use two general kinds of equipment in the 
1990 Census: 

1. highly specialized custom-built equipment with no other known uses; and 

2. general purpose off-the-shelf equipment which is in wide-spread use. 

The Census Bureau is planning to use a 24-pocket laser reader/sorter and a smaller 6- 
pocket laser reader/sorter during its processing of the returned questionnaires. This 
equipment will be built under a special contract to the Census Bureau and cannot be 
used for any other purpose. I find it very difficult to understand why the Census 
Bureau has not devoted a [illegible] effort to find a way to avoid the use of such 
equipment for such a short period. It appears to me from the cost and processing- 
rate data supplied by the Census Bureau that it is not cost effective to use this 
custom-built equipment and it would be less expensive to hire and train individuals to 
use hand wands for the purpose of questionnaire check-in. 

Off-the-shelf equipment which is in wide-spread use should absolutely not be 
purchased for short-term use. It is economical to rent/lease such equipment because 
there are others who can make use of it. Furthermore, the particular equipment that 
the Bureau apparently plans to use will be technologically obsolete by the time of its 
availability for other uses; I would hope that by 1991 the Census Bureau had moved 
its operations to more modern equipment. 

Dr. Daniel Horvitz, in his written statement for the record of this hearing, has 
suggested that instead of conducting this massive once-every-ten-years effort the 
Census Bureau could conduct a continuing census which covered roughly one percent 
of the population each month. I heartily endorse his suggestion, for it solves a 
myriad of dilemmas faced by the Census Bureau, not the least of which is what to 
do with specialized equipment. Under the continuing census there will be a 
continuing need for specialized equipment. This not only provides a better 
justification for building the equipment in the first place but also provides a longer 
period of use to amortize its cost. It is fairly obvious that there will be many other 
cost savings associated with a continuing census. Most of the activities associated 
with the decennial census will not be performed as one-shot activities with the 
attendant start-up costs but rather will be ongoing activities. 







Mr. Ackerman. We will now hear from Dr. Judith Rowe, associ- 
ate director for research services, Princeton University Computing 
Center. 

Dr. Rowe. 

STATEMENT OF JUDITH ROWE 

Ms. Rowe. I have written testimony I would like to submit into 
the record. 

Mr. Ackerman. Without objection, so ordered. 
Ms. Rowe. Thank you. 

I appreciate your kind introduction and I appreciate your invita- 
tion to provide testimony on the subject of the data products from 
the 1990 census and their dissemination. I think it is worth noting 
that among my credentials are several years as a representative to 
the Council of Professional Associations on Federal Statistics 
[COPAFS], including two as COPAFS chairman. 

Since its early years, the Bureau of the Census has made data 
available to the public from each decennial census. Until 1960 all 
of these data were released in the form of printed reports. Al- 
though the formal 1960 publications program included only printed 
reports, early in the decade summary data for census tracts were 
made available on computer tapes for the 43 States which then 
contained census tracts. In the mid-1960's 1 in 1,000 and 1 in 10,000 
public-use samples were released on an experimental basis in the 
form of 200,000 punched cards, or approximately 100 boxes. In 
1970 computer data products were incorporated into the regular 
publication program. These data were provided in two fashions: in 
the form of counts which corresponded to the 1980 summary tape 
files, and in the form of public use samples. The former, the counts, 
contained tables or aggregate data for areas as small as city blocks 
and as large as metropolitan areas, States and regions. The latter, 
the public-use samples, contained records written on computer 
tapes for 1 in every 100 households and for all of the people in 
those households, using 6 different samples. 

In 1980, building on their 1970 experience, the Bureau of the 
Census rechristened their computer products, changing the counts 
to summary tape files and the public use samples to PUMS or 
public use microdata samples. In addition, data were written on 
tape in a more compact and efficient format; microfiche products 
were produced from several of the summary tape files; a major 
printed report, the block statistics, was issued only in microfiche, 
supplemented by summary tape files; the quality of documentation 
for the computer files improved markedly; and the Bureau entered, 
and left, the software business. 

During the decade since plans were made for 1980 census prod- 
ucts there have been major technological changes in the computer 
world, namely the advent of microcomputers, and the prospects for 
additional changes are upon us. The Bureau's reputation as a pub- 
lisher and distributor of data from the decennial censuses is im- 
pressive, both in the United States and abroad. However, in plan- 
ning products from the 1990 census, the Bureau is faced with many 
unknowns which will affect the choice of the physical formats in 
which data will be delivered. The Bureau has already been in- 
volved for some time in an extensive program of public meetings 
from which they hope to learn about the product needs of their 
user community. Although this is a commendable project, there are 
three problems with it: one, the census-using community grows 
from census to census, each time supplementing the experienced 
users with the novices; two, many, if not most, of the people par- 
ticipating in these meetings are recommending that the Bureau 
produce products appropriate for today's technology, not for the 
technology of 1992 and beyond; and three, many of the people who 
are most outspoken in describing their product needs have very 
narrow interests in terms either of subject matter, of geographic 
area or of geographic unit. I have heard a report of the Washington 
Census Products meeting and it seemed to have produced several 
sensible recommendations. One which I would particularly support 
is reducing from 100,000 the size of the areas which can be identi- 
fied in the public use microdata samples or PUMS. In New Jersey, 
as in many other states, this would allow for the identification of 
all counties and would therefore reduce the need for detailed sum- 
mary tables in either printed or machine-readable form. 

When 1990 comes, we at Princeton, along with people at several 
other institutions will be providing access to our fourth decennial 
census in machine-readable form. There are others for whom this 
will be their first census. Although new users may lack a familiari- 
ty with basic census concepts, with the structure of census prod- 
ucts, or with the ability to move around comfortably within the 
machine-readable products, careful thought should be given to how 
much training must be provided by the Bureau and how much can 
reasonably be provided by others. For example, if decennial census 
data in machine-readable form is distributed as part of the deposi- 
tory library program, who will train the depository librarians? 

The technology for providing data to users is changing rapidly. 
Although many users today are requesting that the Bureau make 
data available in the form of floppy diskettes, it is already appar- 
ent that this is an inefficient way to make large amounts of data 
available. For example, a recent request for Summary Tape File 1, 
the smallest STF, for a single county in New Jersey, Morris, re- 
quired 32 diskettes. It seems more likely that by 1992 it will be 
more appropriate, efficient and inexpensive to download from 
online services such as CENDATA or from local services provided 
by State data centers or special services provided by commercial 
vendors. Alternatively, users who are currently using microfiche or 
having tables printed out from summary tapes may find access to 
CD-ROM the most suitable solution. 

The Bureau has had many years of experience in melding togeth- 
er the diverse needs of users. I trust they will be able to compro- 
mise these needs and that those who advise them will not be in- 
timidated by those who yell the loudest. 

Some decisions, however, are affected by other considerations 
and it is in these areas that the Bureau has already made some 
sound recommendations; these are likely to lead to the early re- 
lease of data. For example, by omitting from the first reports data 
from statistical areas which are created on the basis of the data 
collected by the census and for multi-State areas, it will be possible 
to make data for governmental units available sooner. Delaying the 
publication of historical data will also contribute to this goal. 

Not only do users want particular subject matter, for particular 
areas in appropriate formats delivered promptly, but they want 
that data to be correct. It is a monumental task to insure that all 
of the census products are released without error, but this should 
be a primary goal. If there are errors in the census data, no one 
else can correct them. The Bureau of the Census must focus on 
those tasks which only they can do and leave tasks which can be 
done equally well to the State Data Centers and to the commercial 
sector. 

For example, except perhaps for a simple retrieval package, the 
Bureau should not write software. By producing generalized data 
dictionaries, the Bureau can save other software developers much 
time and effort, and provide equally for all of them. 

The Bureau should produce general purpose data products in a 
variety of formats which can be used by others to produce special- 
purpose products to meet individual needs. For example, the 
Bureau should not produce custom diskettes from summary tapes 
anymore than they have produced tape extracts from summary 
tapes. 

I look forward eagerly to the 1990 census and to the products 
which will be produced from it. It is exciting to see the growing use 
of decennial census data and the varied purposes for which it is 
used. If the taking of the census is a national ceremony, surely the 
data which result are a national treasure. 

Thank you. 

Mr. Ackerman. Thank you, Doctor, for your testimony and your 
complete testimony is entered into the record. 

[The statement of Judith S. Rowe and her response to written 
questions follow:] 






STATEMENT OF JUDITH S. ROWE, PRINCETON UNIVERSITY COMPUTING CENTER 



INTRODUCTION 

We take for granted the quality, the quantity and the ease 
with which we access decennial census data. The data 
products created by the United States Bureau of the Census 
have become a model for other federal statistical agencies 
as well as for statistical agencies throughout the world. 
But even models sometimes have flaws. 

In the past decade the data user community has grown in both 
size and diversity. Although we would all subscribe to a 
recommendation that data be accurate, prompt, useful and 
usable, we do not all mean the same things by such a 
recommendation. As Ed Pryor of Statistics Canada noted in 
addressing the Second Census Research Conference, "We cannot 
assume that everyone who uses data advances at the same 
pace." Pryor referred to the "fast data approach," 
comparing it to the fast food approach, and contrasted it to 
the in-depth needs of research demographers. 

We must also contrast the substantive data needs of users in 
different parts of the country. For example, I can live 
with data errors in Wyoming, but data users in Wyoming 
cannot. I can live without detailed ethnic breakdowns for 
census tracts, but the city of New York cannot. I can live 
without detailed data on American Indians but the Bureau of 
Indian Affairs cannot. The decisions concerning products 
from the 1990 census, which will be addressed first, focus 
more on format than on content. Specific content decisions 
will come later. 

Although neither format nor content as such, it is important 
to note that the Bureau must invest more effort than it did 
in 1980 in insuring that the data are accurate before they 
are released. It is true that everyone wants data to be 
released promptly but not at the expense of accuracy. It is 
important to avoid the public proliferation of tapes with 
errors, as in the case of Summary Tape File 2, for which a 
correction tape was issued, or Summary Tape File 3A, for 
which replacement tapes were issued. We have no way of 
knowing how many purchasers actually corrected their STF 2 
files or replaced their STF 3A files and, in any case, I 
would guess that the efforts to correct these errors were 
more costly, both to the Bureau and to the user community, 
than extra pre-release verification would have been. 

The past decade has witnessed enormous technological changes 
in storage media and in computing devices and there are, 
therefore, far more options for producing data products than 
ever before. It was easier to make recommendations in 1976 
and it is likely that there will be sufficient stabilization 
in computer storage technology to make it easier again in 
1996. 



I remember serving on a committee which was making data 
product recommendations for the 1980 census. Although we 
recommended some modifications in the 1970 products we did 
not recommend new formats for providing data. In 1982 our 
recommendations still seemed reasonable. I hope our 1986 
recommendations will fare equally well. In order for this 
to happen we must anticipate today the formats which will be 
commonplace in 1992. 

The Bureau is now circulating a paper titled "1990 Census 
Products Issues" which provides general guidelines and a 
large range of specific questions to which they are seeking 
answers. The paper is based on discussion within the Bureau 
and at numerous public meetings held within the past two 
years, as well as on the advice of advisory committees and 
of individual users and information intermediaries such as 
librarians and State Data Center (SDC) staff. It has been 
no mean feat to mediate the recommendations of so many 
diverse users, users with diverse informational needs and 
technological capabilities. 

Following a general planning section the paper raises 
questions in four major subject areas. These are: 

1) statistical reports, microfiche and tabulation 
contents; 

2) computer tapes, including summary tapes, 
reapportionment/redistricting data, public-use 
microdata files, data for microcomputers and in other 
formats, documentation, special tabulations, and 
software; 

3) maps and geographic reference products, including 
the Master Area Reference File, census tract and block 
maps, geographic-area equivalency files, and area 
measurements; and 

4) services to data users, including communications, 
guides, and indexes. 

By and large my comments will be directed to the Bureau's 
"issue" paper. The focus of the paper purports to be the 
100% data, the short form questionnaire which must be 
completed in every household, but much of the paper applies 
equally to the sample data. A subsequent paper will 
specifically address the sample data, the full range of 
housing and socioeconomic questions which are collected only 
from approximately 20% of the households in the country 
using the long-form questionnaire. 

I. PLANNING FOR 1990 CENSUS PRODUCTS AND SERVICES 

The Bureau of the Census' issue paper on 1990 data products 
begins with some general guidelines and with a schedule for 
data delivery. The latter calls for the publication of all 
data products by the middle of 1993. This is an acceptable 
schedule; the question is "what can the Bureau do to adhere 
to it?" The original schedule for 1980 was also an 
acceptable one, but the Bureau fell far short of it. 

Perhaps the most specific and the most imaginative of the 
proposals the Bureau is offering for meeting its schedule is 
that data for governmental units be made available without 
waiting for data for statistical areas. Summaries for 
metropolitan statistical areas (MSA's), urbanized areas or 
census-designated places are based on the data collected by 
the census and cannot be prepared until all of the data for 
each area's component parts are tabulated. Further delays 
will be avoided by postponing publication of data for multi- 
state areas such as MSA's, urbanized areas or Indian 
Reservations. Other commendable ideas for speeding up 
census product production are the separation of historic 
data into separate products and the reduction of the amount 
of race and ethnic data in the standard printed reports. 

There are some plans under consideration for improving fiche 
products and for developing products for the novice user. 
Given the cost and the proposed attention to new media, I 
would question the value of too much investment in improving 
the quality of the microfiche products. I also fail to 
understand a recommendation, made in another Bureau paper, 
to produce on-demand printed copy from microfiche. Surely 
this is a function of the State Data Centers or of the 
commercial sector. The same goes for products like video 
tape, graphic summaries, or on-demand pamphlets with 
extracts from summary tape products, all proposed as a means 
of serving novice users. Whatever effort is expended on the 
indices for the summary tape products can and should be 
extended to the microfiche. For limited use the Bureau- 
produced fiche is satisfactory and increasing the ease of 
using machine-readable products such as CD-ROM, for example, 
should preclude the need for microfiche in the future. 



II. PRINTED REPORTS AND MICROFICHE 

A. POPULATION AND HOUSING COUNTS FOR GOVERNMENTAL AREAS 

As indicated above, the need for prompt data for 
governmental areas is extremely great. Since MSA's are 
composed of counties, users can at least reconstruct them as 
they previously existed. The general principle of releasing 
available data as soon as possible is excellent and 
providing 100% data for governmental areas without waiting 
for sample data also makes good sense. 

Users often photocopy pages from printed reports. This 
would be expedited by improving the quality of the binding, 
by using gathered, sewed or stapled bindings instead of 
perfect bindings. The use of margins would also improve the 
quality of copies. Many of the printed reports from 
decennial censuses are heavily used and, although hard 
covers would be ideal, better paper covers would be some 
help. 

B. BLOCK STATISTICS REPORTS 

The experience of 1980 indicates that there is no necessity 
for publishing printed block statistics reports. Since 
blocks will now be available nationwide, it seems reasonable 
to produce them by state or county group within state rather 
than by metropolitan area. Although it may be premature to 
make this decision, CD-ROM may well be a felicitous 
substitute for 300,000 frames of fiche. Block data are 
typically used for the study of local areas, seldom for the 
country as a whole. Therefore, for most users the 
availability of more blocks will not justify any but the 
most trivial loss of subject material. 

The block statistics maps are the basic census maps. It is 
essential that they be available to users in hard copy and 
there seems no reasonable way of doing this except by 
printing them. However, it might be well to substitute the 
smaller size of the 1970 block maps for the larger and 
somewhat more awkward size of the 1980 maps. 

C. COMBINED POPULATION AND HOUSING REPORTS 

It is acceptable to combine 100% housing and population data 
in the printed reports, as has previously been done with the 
summary tapes, and to publish these data prior to the 
availability of either the sample data or the statistical 
area data. However, it is not acceptable to omit data for 
MCDs in 11 states, particularly as in New Jersey, where all 
of those MCDs are governmental units. We did battle on this 
issue after the 1970 census and it was my understanding that 
it had been resolved. I have never been able to explain to 
users why there was printed information for Princeton Boro, 
which was a place, but not for Princeton Township nor why 
Woodbridge Township and Hamilton Township, neither of them 
places but both with populations of over 100,000, did not 
appear in the 1970 printed reports. I am not comfortable 
with the omission of data for smaller places and I have 
never been able to understand the logic of printing data by 
population size groups; if you don't already know the size 
of the area, you don't know where to look for information 
about it. It is not necessarily true that the amount of 
information one needs about an area is a function of its 
size. 

D. REPORTS COMBINING 100-PERCENT AND SAMPLE DATA 

It does not seem necessary to separate the publication of 
the 100% census tract data from the publication of the 
sample census tract data in the way that is being proposed 
for larger areas. However, information about the 
availability of the 100% data in non-print formats should be 
made more widely available than was the case in 1980. In 
1970 Newark Public Library produced a list showing all of 
the 567 municipalities in New Jersey and the tracts which 
each contained. If we had a comparable list for every state 
in 1990, it wouldn't make any difference in what order the 
tract data were published. Numeric order by county would be 
fine. The proposal to provide only metropolitan tracts in 
printed form seems acceptable. It would seem reasonable to 
consider much of the race and Spanish-origin detail 
published for tracts in 1980 as candidates for special 
tabulations. Because of the changes in tract boundaries, 
the creation of new tracts, etc. I would not recommend 
providing 1980 population and housing counts; in many cases 
this information will be misleading. I would appreciate 
having land areas and, armed with this and tract population, 
I think most users could compute density by themselves. 
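
The density computation mentioned above is simple division. The following is a minimal sketch only; the function name, the choice of square miles as the unit, and the sample tract figures are illustrative assumptions, not taken from any census product.

```python
def density(population: int, land_area_sq_mi: float) -> float:
    """Return persons per square mile of land area."""
    if land_area_sq_mi <= 0:
        raise ValueError("land area must be positive")
    return population / land_area_sq_mi


# Hypothetical tract: 4,200 residents on 1.5 square miles of land.
tract_density = density(4200, 1.5)
print(tract_density)
```

A user armed with the published tract population and land area would need nothing more than this to derive density for any tract.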

E. HISTORICAL COUNTS AND DATA 

I see no need to publish historical data for either 
metropolitan areas or for tracts. It is certainly useful 
for governmental areas, in both the printed reports and on 
the summary tapes, but no reports or tapes should be delayed 
in order to include these numbers. They can always be 
obtained from the publications of earlier censuses. 

F. MICROFICHE 

Without a good sense of the amount of use the 1980 fiche 
received, and without a sense of the costs involved in 
improving the quality of the fiche, it is difficult to 
comment on the proposal to use a more expensive fiche 
product, one which would generate a substantially greater 
number of frames. It is not hard to read the fiche 
currently in use on the screen and most users do not require 
hard copy. Reader-printers rarely produce easy-to-read 
copies, even of the best fiche, and it seems foolish to 
expend too much money or effort on a transient technology. 
As indicated earlier, the one effort which seems warranted 
is one which would provide, if not an index to the fiche, at 
least some meaningful labeling on the fiche themselves. If 
the cost is not prohibitive, I would recommend doing both. 

G. RACE AND SPANISH ORIGIN DATA 

It seems appropriate to reduce the amount of subject detail 
cross-tabulated by race and Spanish origin in the regular 
printed reports. The real question will be which tables to 
eliminate. If it is necessary to decrease the number of 
areas for which race and Spanish origin data are produced, 
raising the required population of a group in order for an 
area to qualify for inclusion (e.g. from 400 to a higher 
number) would seem easier for library patrons than switching 
to a percentage system. This is not a burning issue; using 
the Trenton SMSA as an example, raising the minimum from 400 
to 800 would probably reduce the relevant section of a 
report now containing 46 pages plus 47 pages of appendices 
from 18 pages to 12. What is this saving in dollars and cents? 
Could this money be better spent on special subject reports? 



III. MACHINE-READABLE PRODUCTS 

A. SUMMARY TAPE FILES 

The first machine-readable product is the reapportionment/ 
redistricting file mandated by Public Law 94-171. The 
reapportionment counts are scheduled for release by December 
of 1990; the PL 94-171 file by March of 1991. A survey of 
the state officials who used these data for the 1980 
reapportionment and redistricting reveals that they were 
generally satisfied and that the few problems that they did 
identify seem likely to be addressed by the Bureau's plans 
for 1990, the complete blocking of the entire country and 
the development of the TIGER system to improve the quality 
and consistency of the census geography and maps. 

As with the printed reports, the Bureau is attempting to 
make 100% data available earlier by providing it without 
waiting for the statistical areas, without the addition of 
historical data and without the inclusion of areas which 
cross state boundaries. This is a commendable idea since 
each Summary Tape File (STF) will eventually have a national 
level file which will include these areas. If the Bureau 
plans to produce a printed report with historical data I 
fail to understand why producing a machine-readable one 
would be any more effort. A congressional district file for 
the 102nd Congress would seem to be a useful reference since 
in some states districts will change quite dramatically 
after reapportionment. However, my own recollection is that 
there was actually little call for these data. 

The enormous increase in the number of blocks nationwide 
would certainly argue for separating the blocks from the 
other geographic areas. Using the same table format for 
both would provide the advantages of 1970 when blocks were 
separated and the advantages of 1980 when they were merged 
with the larger areas. 

The proposal to provide two versions of STF2 and of STF4, 
one for total population only, seems quite sensible. The 
majority of users of these files used only the total data. 
In fact much of our use of STF2 would have been eliminated 
if single years of age had been included in STF1. 







One 1980 initiative which should be continued is the use of 
area names. Although it is useful to have some area 
identifier on each segment of a logical record it is only 
necessary for the first segment to have as long an 
identifier as was used in 1980. My guess is that most 
people use FIPS codes and that the census codes could be 
omitted. Among the other useful additions in 1980 were the 
complete-count variables on the sample files and the 
addition of medians. Another useful addition would be 
median age. It is my impression that we had more requests 
from the 1980 census for data cross-tabulated by age than 
was the case from earlier censuses. This was over and above 
the new tabulations for people 60 or 65 years and over. 

There is no question that the use of split tracts caused 
numerous problems. The most obvious, but perhaps the most 
difficult, solution to this would be to get rid of them. 
Including summaries for tracts, as well as for BNA's or 
places, that are split by a higher geography would certainly 
help but it is difficult to recommend a subject matter 
deletion which could be made as a trade. 

Although I am sure that there would be some use of an STF 1 
zip code file it does not seem to be a high priority item. 
The creation of SPSS "export" files might well antagonize 
two sections of the private sector; the producers of value- 
added census files and the other statistical package 
vendors. I recognize the Bureau's desire to ensure that an 
easy-to-use product is available to the public, but I doubt 
that this is the answer. Certainly it is not a substitute 
for the more generalized format now in use. 

B. PUBLIC-USE MICRODATA FILES (PUMS) 

The 5% sample was unquestionably the most heavily used, both 
because of its size and because of its structure. The 
metropolitan sample received somewhat less use, although it 
is possible that had it also been a 5% sample it would have 
been more popular. The urban-rural sample was probably 
almost as unpopular as the 1970 Neighborhood Characteristics 
Sample. I cannot explain the unpopularity of either. There 
is no doubt a market for separate microdata samples for age, 
employment status and a variety of other characteristics as 
well as race and ethnicity. However, it seems more 
appropriate for the Bureau to draw the line at producing 
general purpose samples and then allowing others to select 
special populations from them. Although it does not seem 
necessary to add family records to the PUMS, family sequence 
numbers on the person records would make intra-family as 
well as intra-household analysis possible. 
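
To illustrate why a family sequence number on each person record would be enough, here is a minimal sketch that groups person records into families without any separate family record. The field names (`household`, `family_seq`, `age`) and the sample records are hypothetical and do not reflect the actual PUMS record layout.

```python
from collections import defaultdict

# Hypothetical person records; field names are illustrative only.
persons = [
    {"household": 1, "family_seq": 1, "age": 41},
    {"household": 1, "family_seq": 1, "age": 39},
    {"household": 1, "family_seq": 2, "age": 67},  # second family, same household
    {"household": 2, "family_seq": 1, "age": 28},
]


def group_by_family(records):
    """Group person records by (household, family sequence number)."""
    families = defaultdict(list)
    for rec in records:
        families[(rec["household"], rec["family_seq"])].append(rec)
    return dict(families)


families = group_by_family(persons)
family_sizes = {key: len(members) for key, members in families.items()}
print(family_sizes)
```

Grouping on the household number alone gives the intra-household view; adding the sequence number to the key gives the intra-family view from the same person records.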

C. DATA FOR MICROCOMPUTERS 

Microcomputer technology is changing almost daily. It is 
almost impossible to decide today how best to provide data 
for access by microcomputers in 1992. However, if I can 
blue-sky a bit, I would anticipate that in 1992 floppies or 
flexible diskettes will have gone the way of the punch card; 
that laser disk products will be more standard, less 
expensive and therefore more available; that the typical 
hard disk will be many times larger than is the case today 
and that micro-mainframe communication will be ubiquitous. 
I think this is a conservative prophecy, one which we can 
plan for with almost absolute certainty. Being a bit less 
conservative, I would guess that there are likely to be far 
more advanced laser products available and that micro- 
mainframe communication will be substantially cheaper. 
Under any circumstances I would assume that in 199[?] CD-ROM 
will be less expensive to produce and that many, if not 
most, users will have access to the necessary technology. 
In terms of function I see CD-ROM as a substitute for 
microfiche and as the source for on-demand publishing. 
Therefore, it is entirely appropriate for the Bureau to 
confine itself to the development of simple display and 
extraction software and to leave the development of analytic 
tools to the commercial sector. I do not yet see a 
practical substitute for magnetic tape; rather I see it as a 
continuing source of extracts to be downloaded and as the 
medium for large-scale analysis. Although I do not 
anticipate a long-term need for either the Bureau or other 
data distributors producing custom diskettes, there is 
likely to be a period within the next three to five years 
during which the two technologies are likely to overlap. 
Unless I am very mistaken this period should be over by the 
time the Bureau is ready to release 1990 census data. 

D. DOCUMENTATION 

The quality of the documentation that has been released with 
the decennial census data is excellent. Although the User 
Note system is awkward unless the documentation is in a 
loose-leaf format (and that presents the missing page 
problem), there seems no better solution. My only 
recommendation is that each edition of the documentation 
include an edition number and/or date and the numbers of the 
User Notes it incorporates. If it did not add excessively 
to publication costs, I would argue for publishing the 
documentation in BOTH bound and loose-leaf form. 

The Bureau should develop data dictionaries in a "generic" 
format that could be converted by users for use with any 
software. This seems more efficient and less costly than 
trying to develop separate dictionaries for each of the 
software products used with census data. The dictionaries 
should be either sold separately or made available by 
subscription and released on a flow basis. Many users will 
want to start work with both the documentation and the 
machine-readable dictionaries as soon as they are available; 
others will have no interest in the accompanying material 
until the data are available and still others will acquire 
their dictionaries from some other source. 
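
As a rough sketch of what a "generic" machine-readable dictionary might look like, the example below describes each variable by name, starting column, width and label, then uses that one description both to read a fixed-width record and to emit a plain column specification that a user could adapt to a particular package. The variable names, column positions and sample record are invented for illustration and are not an actual summary tape layout.

```python
# Hypothetical generic dictionary: one entry per variable, with a 1-based
# starting column and a field width. All names and positions are invented.
DICTIONARY = [
    {"name": "STATE", "start": 1, "width": 2, "label": "State FIPS code"},
    {"name": "COUNTY", "start": 3, "width": 3, "label": "County FIPS code"},
    {"name": "POP100", "start": 6, "width": 9, "label": "Total population count"},
]


def parse_record(line, dictionary):
    """Slice one fixed-width record into named fields using the dictionary."""
    fields = {}
    for var in dictionary:
        begin = var["start"] - 1  # convert the 1-based column to a 0-based index
        fields[var["name"]] = line[begin:begin + var["width"]].strip()
    return fields


def column_spec(dictionary):
    """Render 'NAME first-last' lines that a user could adapt to any package."""
    return [
        f'{var["name"]} {var["start"]}-{var["start"] + var["width"] - 1}'
        for var in dictionary
    ]


record = parse_record("34027000123456", DICTIONARY)  # made-up record
print(record)
print(column_spec(DICTIONARY))
```

The point of the design is that the Bureau would maintain only the one neutral description; converting it into the syntax of a given statistical package is left to the user or the vendor.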

E. SPECIAL TABULATIONS 

My understanding is that the Neighborhood Statistics, the 
EEO (affirmative action) and the School District files were 
all well-used. If this be the case, I suggest they be 
continued. As for other types of areas, defined by users in 
terms of census blocks, this is only sensible if it can be 
done efficiently and at a reasonable cost. If standard 
tables are used this should be the case; if users wish to 
produce custom tables for custom areas, the cost to the user 
should increase substantially. 

The idea of producing "subject files" with data for counties 
in much the same way as in previous years "subject reports" 
were produced for regions or states is a good one. It is my 
belief that the subject reports were sorely missed and that 
having data for counties would greatly enhance their 
usefulness. These subject files seem a good candidate for a 
CD-ROM format. 

F. SOFTWARE 

The Bureau should do what the Bureau can do best, and 
writing software is not it. As far as I am concerned 
CENSPAC was an expensive fiasco which I would not like to 
see repeated. I would assume that the software which the 
Bureau develops for internal use will have certain special 
features which would make it inappropriate for general 
users. 

G. OTHER 

I realize that in order to protect the confidentiality of 
census respondents it is necessary to suppress or modify 
some of the published data. I know that there were 
complaints about the flag system which was used to indicate 
suppression on the 1980 summary tape files. However, I find 
the two other solutions of which I am aware less 
satisfactory than the one in current use. The system used 
in 1970 was awkward and confusing and the random rounding 
system being considered requires constant explanation for 
tables whose cells do not add up correctly. Unfortunately 
most data users would rather that numbers add up, even if
they add up to the wrong total. 
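
The point about random rounding can be illustrated with a minimal sketch. It assumes unbiased random rounding to a base of 5, which is one common form of the technique; the testimony does not say which variant was being considered, and the function names and cell counts below are hypothetical.

```python
import random


def random_round(value, base=5, rng=random):
    """Round a non-negative count to a multiple of `base`, up or down at random.

    The chance of rounding up is remainder/base, so the expected value
    of the rounded count equals the original count.
    """
    remainder = value % base
    if remainder == 0:
        return value
    lower = value - remainder
    return lower + base if rng.random() < remainder / base else lower


def round_table(cells, base=5, seed=0):
    """Round each cell and the true total independently (an assumed procedure)."""
    rng = random.Random(seed)
    rounded_cells = [random_round(c, base, rng) for c in cells]
    rounded_total = random_round(sum(cells), base, rng)
    return rounded_cells, rounded_total


cells = [12, 7, 23, 4]  # hypothetical cell counts; the true total is 46
rounded_cells, rounded_total = round_table(cells)
# The sum of the rounded cells need not equal the published rounded total.
print(rounded_cells, sum(rounded_cells), rounded_total)
```

Because each cell and the total are rounded separately, the published total can differ from the sum of the published cells, which is the discrepancy that would need constant explanation.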

IV. MAPS AND GEOGRAPHIC REFERENCE PRODUCTS 

Although the development of the TIGER file will not 
radically change the needs of the user community for maps 
and other geographic reference products, it should radically
change the Bureau's ability to produce these products.
Unfortunately the Bureau's census products issue paper does 
not address these concerns. Obviously there are some 
products which should be produced at any cost and other 
products for which even a small expenditure would be 
unjustified. I am not a map expert and therefore I can only
share with you my own satisfactions and my own annoyances in 
using the Bureau's geographic products. 

I cannot at this point comment on the proposed TIGER 
products. As for the Block Numbered Map Series, my primary
concern is that there be block maps available in paper form,
of a size similar to the 1970 maps, and that preferably they 
be organized by counties. Although the addition of a second 
color to these maps is desirable, it is not essential. 
Additional map features are always a plus; they help you 
locate blocks in unfamiliar areas. However, some reasonable 
compromise is certainly acceptable. I made little use of 
the Reference Maps in the past, but I may not be typical. 
The thematic maps were very attractive but I am not sure 
that their production is really a Bureau function. Armed
with the appropriate data, any of the many academic or for- 
profit organizations with mapping capabilities could produce 
similar products, perhaps even at moderate cost. The 
ability to do this would, of course, be enhanced by the 
availability of Digital Geographic Area Boundary Files. The 
complete digital files are very large and the Bureau should 
make them available on tape. It might be reasonable to make 
"thinned" state and county files available on diskettes, 
although they are widely available commercially. As for
producing "thinned" files for smaller substate areas, this
seems a reasonable function for the State Data Centers. 

The Bureau should continue to produce the Master Area 
Reference File (MARF). It is very convenient to be able to 
get all of the small-area geography for the whole country on 
a single tape. However, adding some few variables such as
land area to the Summary Tape Files will obviate the
necessity for producing more than one edition of MARF. This
will be a saving for both the Bureau and for the data
centers which acquire its products. I don't feel that I 
have adequate information to comment on the Address 
Reference or Geocoding Files. 

My experience is that both of the Geographic Reference
Products were well-used. The Geographic Identification Code
Scheme (GICS) is a quick reference for locating MCDs/CCDs,
and places, and for finding appropriate codes for them, and
for states, counties and metropolitan areas. Although
Equivalency Files between censuses have some value, it is
difficult to explain the nature of boundary changes in
tabular form. The same is true for equivalencies within 
censuses as in the cases of school districts, neighborhoods 
and other similar districts. I would welcome an imaginative
way of dealing with these problems. On the other hand, it
is enormously useful to have in summary form the places in a 
given county group area or the tracts in a given county. 

V. SERVICES TO DATA USERS 

A. STATE DATA CENTER PROGRAM 

Princeton and Rutgers Universities have been providing 
access to machine-readable census data since the early 
1970's and, through the aegis of the Joint Committee on
Printing's Depository Library Program, to printed census
material since the 1860's and even earlier. The JCP is
exploring the addition of data in electronic format to the
Depository Library Program. It is likely that by the time
data are released from the 1990 census not only Princeton 
and Rutgers but all state and major public, academic and law 
libraries will be developing the facility for making 
machine-readable data available to their patrons, probably 
via microcomputers. It behooves the State Data Centers to 
prepare for a greater involvement of libraries in the area 
of data delivery. I am pleased to see that libraries are 
already being involved in State Data Center meetings. In 
addition, however, the major libraries, many of which are
already actively involved in the SDC program, should think
more seriously about how to serve the small public and 
school libraries better. 

B. COMMUNICATIONS 

All of the Bureau of the Census' current newsletters and 
fliers have well-defined purposes and well-defined
audiences. There seems no reason to produce an additional
newsletter. Data User News is the general-purpose, large
audience newsletter which reaches individuals interested in
any or all of the Bureau's products. Monthly Product
Announcement provides quick information on new Census
products of all types, those already released and those soon
to be announced. Fact Finders for the Nation are excellent
brief reference tools which describe all of the sources of
information, regardless of format, on a given subject and
are brief and inexpensive enough to be acquired in quantity.
The chart on pages 6 and 7 in the revision of the Factfinder
for the Nation on "Data for Small Communities" is an
excellent model for future Factfinder products. Data
Developments provides a data service or library with
abstracts for all of the Bureau's machine-readable products,
as they are released. This makes it unnecessary to acquire
the full technical documentation or codebook unless the data
themselves are being acquired. CENDATA can be used as a
means of providing prompter information about new products. 
Handouts describing each of these information sources as 
well as the Users' Guide and the Indexes could provide 
guidance in their use. This would increase their use,
create more informed users, and fewer inquiries to our
offices.

C. 1990 CENSUS USERS' GUIDE AND INDEXES 

A Census Users' Guide is essential for users of census
products. However, it is important that as much of the
Guide as possible be made available in 1990 along with
skeletons of the data products, a glossary and indexes. I
would prefer having a subscription to the Guide and
receiving replacement pages as they are issued. However, I
recognize that for libraries and other places at which they
provide more public access to their publications this is 
difficult. Could we have it both ways? 

D. CENDATA 

CENDATA is developing a growing audience, one which will
increase as it becomes available through less expensive
services. It can be used effectively to provide new data
for large areas as they become available, new product 
announcements and, if possible, as an ordering service. 
CENDATA offers a useful means for providing on-line access 
to national data, data for states, metropolitan areas and 
large cities. I question its cost-effectiveness for smaller
areas.



CODA 

We stand at an exciting time as we view the prospect of a 
growing body of users of federal information, a substantial 
portion of whom are users of data from the decennial census. 
In an age of technological change we see the role of the 
computer as a means of producing these products as well as a 
tool for using many of them. The Bureau of the Census' 
efforts to poll the public for their advice in the design of 
the products which will emanate from the 1990 census is 
commendable. We've come a long way from the single volume 
which summarized the early censuses. 






Princeton University 
Computing Center 
87 Prospect Avenue 
Princeton, NJ 08544 



May 8, 1986 



Congressman Robert Garcia 

U.S. House of Representatives 

Committee on Post Office and Civil Service 

Subcommittee on Census and Population 

219 Cannon House Office Building 

Washington, DC 20515 

Dear Congressman Garcia, 

I welcome the opportunity of responding to your thoughtful questions. I 
am sorry that you were not able to be at the hearing and I look forward 
to meeting with you at another time. However, I did appreciate the 
opportunity of appearing before Congresswoman Oakar and Congressman 
Ackerman. 

I am enclosing a clean copy of my oral testimony. The answers to your
questions appear below. 

1. I understand your concern about my advocating the use of CD-ROM as a
medium for providing access to the 1990 census data and I share your
concern for end users. However, it is my belief that the State Data
Center program should serve as the intermediary between the Bureau of
the Census and the occasional user of small amounts of census data. It
is to these services that a user should go for data in hard copy or even
diskette form. In addition, I should point out that six years is a very
long time in terms of changes in computer technology, particularly in
terms of changes in microcomputer technology. I have reviewed this
recommendation with colleagues in both the computer field and the
library field and they do not find it premature. I view CD-ROM
primarily as a substitute for microfiche which will provide cheaper,
easier to use copies of improved quality. In addition, proposals by
the Joint Committee on Printing concerning providing depository
libraries with data in electronic format will add further incentive to
the library acquisition of CD-ROM readers and appropriate printers. The
law covering depository libraries will allow users of these facilities
to acquire census data at the cost of [illegible].

2. I realize that the testimony I provided [illegible] for racial
and ethnic minority groups was ambiguous. I believe that the 1990
census should include the same number of questions about these groups,
that the same amount should be published for governmental units and for
larger statistical or other census-designated areas. My only
recommendation was that we might, perhaps, increase the minimum racial
or ethnic population in census tracts, e.g., for which we publish
tables. Since the average size of a census tract is about 4,000 people
I thought we might up the minimum from 400 to 800, i.e. from 10% of the
population to 20% of the population. This would save a nontrivial 
amount of paper and the data would be available for special tabulations. 
I do not believe the data for the smaller ethnic and racial populations 
is sufficiently used to warrant its being available in hard copy.
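
The arithmetic behind the proposed threshold can be checked with a short sketch. The tract size and the two minimums come from the letter; the group populations and function names are hypothetical and only illustrate which printed tables would be dropped.

```python
AVERAGE_TRACT_SIZE = 4_000  # average census tract population cited in the letter


def share_of_tract(minimum_group_size, tract_size=AVERAGE_TRACT_SIZE):
    """Return the minimum group size as a fraction of the tract population."""
    return minimum_group_size / tract_size


def publishes_table(group_population, minimum_group_size):
    """A printed table appears only when the group meets the minimum size."""
    return group_population >= minimum_group_size


for minimum in (400, 800):
    print(minimum, f"{share_of_tract(minimum):.0%}")  # 400 -> 10%, 800 -> 20%

# Hypothetical group populations within tracts of average size.
hypothetical_groups = [350, 400, 650, 800, 1200]
printed_at_400 = [g for g in hypothetical_groups if publishes_table(g, 400)]
printed_at_800 = [g for g in hypothetical_groups if publishes_table(g, 800)]
print(len(printed_at_400), len(printed_at_800))  # 4 tables versus 2 tables
```

Under the higher minimum, the groups of 400 and 650 in this made-up example would no longer get a printed table, although their data would remain available through special tabulations.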

Again, I thank you for the invitation to testify. I trust that you have 
found my testimony of value in evaluating the Bureau of the Census'
proposals for providing public access to the data from the 1990 census. 




Judith S. Rowe 
Associate Director for 
Research Services 



JSR:BWJ
Enclosure 






Mr. Ackerman. We will now hear from Dr. Benjamin F. King, 
the director of survey methods, Educational Testing Service. 
Dr. King, welcome. 

STATEMENT OF BENJAMIN F. KING, DIRECTOR, SURVEY 
METHODS, EDUCATIONAL TESTING SERVICE 

Mr. King. Thank you, Mr. Chairman. 

Although seemingly a prosaic technical issue, the compilation of 
accurate lists of addresses for the structures that contain the hous- 
ing units and other quarters where the population resides is the 
very backbone of census operations. A good list is essential for good 
coverage, and good coverage is in turn an important factor in 
achieving a successful enumeration. 

The several tests of address list compilation activities discussed 
today are, I believe, excellent examples of the type of quasi-experi-
ment that is most appropriate for the precensus years. Although
there might possibly be a higher motivational effect when U.S. 
Postal Service employees are engaged in the real thing, I think 
that to be unlikely. Most importantly, there is no major element of 
cooperation by the general public that is required for full success of
the experiment. I contrast this kind of situation with those of other 
tests of census procedures involving public response. An example is 
the recent test in Jersey City of the two-stage census procedure. As 
you probably know, the response rate in Jersey City was, I believe, 
38 percent. There was a similarly low response in Tampa. That is, 
to the request for mail return, about 55 percent, I believe; and 
when this sort of phenomenon occurs, it is very difficult to inter- 
pret the results of these tests because the issue under consideration 
depends so much on public cooperation. 

And, as an aside, I think that more of those kinds of tests should 
be embedded in the decennial census because without that the con- 
ditions are just not sufficiently realistic to provide easily interpret- 
ed results. 

Back to the address list compilation procedures. The summary 
report is based on four preliminary research and evaluation memo- 
randa. These are referred to as PREM's, and they are PREM No. 
28, entitled "Results and Analysis of the Urban Address List Com- 
pilation Test," and PREM No. 38, dealing with the rural address 
list compilation test; PREM No. 12, unit by unit precanvass find- 
ings; and finally PREM No. 6, the "Results of the Advanced Post 
Office Check [APOC] II in the 1985 Pretest." 

The order in which I just read these is the order in which I will 
briefly discuss them. 

The impetus behind the 1984 compilation tests, both urban and 
rural, was the GAO report entitled "A $4 Billion Census in 1990? A 
Timely Decision." And that report recommended the investigation 
of the use of lists compiled by the U.S. Postal Service as the start- 
ing point for the sequence of update procedures that results in the 
final census frame. With respect to the basic design of the urban 
test, I find the Bureau to have been quite thorough in comparing 
the various combinations of starting lists— the vendor lists, the
Post Office lists, the 1980 census list— and update methods— de-
pendent canvass and the casing check.






I agree that it does not make sense to combine the casing check, 
which is a U.S. Postal Service operation, with the Postal Service 
starting list in laying out the experimental design. Furthermore, 
the Bureau should be commended for its frank discussion in PREM 
No. 28 of the shortcomings of the test, and I think the Bureau is 
very open about this. 

For example, it clearly acknowledges the restriction of inferences 
to the two purposively selected test sites, Hartford and Bridgeport. 
This is always a problem in these tests. It would be wonderful if we 
could do the test in many different cities, but we have to select one 
or two. They are usually selected, or it seems at least, for worse- 
case characteristics, but unfortunately they turn out to have best- 
case characteristics for other aspects. So it leads to difficulty in in- 
terpretation. 

With all of this openness and candor though, I am puzzled by one 
statement. This is in section LP. of the PREM 28. I quote, ". . . this
is not to imply . . . that there is not interest in knowing how these 
results would compare in other areas; areas with a higher growth 
rate, for example. On the contrary, the Bureau is presently exam- 
ining ways in which this can be accomplished." That is the end of 
the quote. 

If this statement is implying that there is some way to achieve 
this without actually testing the other areas, I should be very in- 
terested to find out how that is to be accomplished. 

Anyway, in spite of the weaknesses, the results of this test are 
compelling and conclusive. Although the USPS list has high initial 
coverages compared to the vendor list, the cost after updating is so 
great that it cannot compete with the other two methods of listing. 
It appears that in its September 1986 decision, coming up, the 
Bureau will choose the basic approach used in 1980, that is, vendor 
lists followed by an Advance Post Office Check and then followed 
by dependent canvass, casing, and time of delivery checks. There
are three Post Office checks and one census dependent canvass. 
The update procedures, however, are likely to be improved accord- 
ing to the findings from the 1985 and later census pretests. 

In areas with poor vendor coverage, as Dr. Keane mentioned 
today, the address list from the 1980 census will serve as the most 
cost-effective start, and this is particularly relevant for certain 
intercity areas, for example, where the vendors themselves have no 
intention of going in and trying to create lists, at least not in the 
near future. 

In my opinion these implied decisions are good decisions, but the 
Bureau also properly observes that the 1980 census list will not be 
as adequate in 1990 as it was in 1984 because of the obvious hous- 
ing changes during the intervening 6 years. 

Briefly, in the rural tests in Texas and in Georgia the Post Office 
listing was clearly inferior and cost ineffective because of errors in 
geography and map-related problems. The Bureau sees no chance 
of the USPS procedures being sufficiently improved nor of costs be- 
coming sufficiently lower in time for the 1990 census and I concur 
with this judgment. 

I also concur with the Bureau's optimistic view concerning the 
positive returns to be expected from continuing collaboration with 
the U.S. Postal Service as we move into the era of automated ad-
dress files and automated navigation systems for enhancement of
geocoding. 

Now a brief word about the unit-by-unit precanvass. As described 
in PREM No. 12, the new precanvass procedures that were tested 
in the 1985 census in Jersey City and Tampa appear to have result- 
ed in better coverage of individual apartments in multiunit build- 
ings. In 1985 the enumerator did a unit-by-unit canvass within 
each apartment building whereas in 1980 information on number 
of units was usually obtained from a manager or some similar 
source. The additional cost of the unit-by-unit canvass would 
appear to be exceeded by the gains in correct apartment identifica- 
tion. Another finding is that the majority of apartment corrections 
in the unit-by-unit precanvass are for addresses that were classified 
in the Advance Post Office Check as deliverable without needing 
corrections, and this apparently confirms the need for a portfolio of 
update procedures rather than reliance on a single method. 

There was another test — I mentioned the Advance Post Office 
Check II — in Jersey City and in Tampa, and with respect to experi- 
mental design this is the most interesting because of the clever 
method of salting addresses that were classified in the original 
APOC as undeliverable with a random sample of deliverable ad- 
dresses to see if the U.S. Postal Service is really doing its job on 
checking on the original classification. It is encouraging to see that 
the results show no evidence of widespread Postal Service careless-
ness or rubberstamping in the second check.
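
The salting design described above can be sketched as follows. This is only an illustration under stated assumptions: the address labels, sample sizes, function names, and the two simulated checkers are hypothetical, and PREM No. 6 may describe the actual procedure differently.

```python
import random


def build_salted_list(undeliverable, deliverable, n_salt, seed=0):
    """Mix addresses first classified undeliverable with a random sample of
    deliverable addresses (the "salt"), then shuffle so the recheck is blind."""
    rng = random.Random(seed)
    salt = rng.sample(deliverable, n_salt)
    mixed = [(a, False) for a in undeliverable] + [(a, True) for a in salt]
    rng.shuffle(mixed)
    return mixed


def salt_detection_rate(mixed, recheck):
    """Fraction of salted (truly deliverable) addresses the recheck reports
    as deliverable. A rate near zero suggests rubberstamping."""
    salted = [a for a, is_salt in mixed if is_salt]
    if not salted:
        raise ValueError("no salted addresses in the list")
    found = sum(1 for a in salted if recheck(a) == "deliverable")
    return found / len(salted)


# Hypothetical data: labels starting with "D" are truly deliverable.
undeliverable = [f"U{i}" for i in range(20)]
deliverable = [f"D{i}" for i in range(100)]
mixed = build_salted_list(undeliverable, deliverable, n_salt=10)


def careful_check(address):
    """Simulated checker that really verifies each address."""
    return "deliverable" if address.startswith("D") else "undeliverable"


def rubberstamp_check(address):
    """Simulated checker that repeats the original classification."""
    return "undeliverable"


print(salt_detection_rate(mixed, careful_check))      # 1.0
print(salt_detection_rate(mixed, rubberstamp_check))  # 0.0
```

A checker who merely confirms the earlier classification would return the salted addresses as undeliverable, so a high detection rate on the salt is evidence that the second check was actually performed.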

The findings indicate, however, that the classification of an ad- 
dress as undeliverable in two checks is not sufficient evidence to 
delete the address, and thus the second Advance Post Office Check 
does not appear to be required. In other words, no matter what it 
accomplishes, it still is not enough to convince that an undelivera- 
ble is undeliverable. 

Unfortunately, further analysis from 1985 of the combination of 
precanvass and Post Office determinations of status still does not 
indicate that addresses can be deleted before the followup checks. 
Less conservative reviewers might not agree with the implied 
Bureau decisions on these matters, but I think that caution in 
making dramatic changes to the present procedures is appropriate 
in this case. 

In conclusion, I believe that in its 1986 and future activities the 
Bureau is pursuing a vigorous program of testing methods to im- 
prove address lists. The continued use of the unit-by-unit precan- 
vass in Los Angeles County, the Advance Post Office Check and 
reconciliation in rural Mississippi, and the test involving an auto- 
mated address list in one Mississippi County are high priority 
items. I believe that they are in accord with the recommendation of 
the Committee on National Statistics Panel on Decennial Census 
Methodology that scarce resources for testing need be applied to 
only the most promising coverage techniques. In other words, these 
are promising and they should be worked on.

I also look forward to the test in 1987 of the use of vendor lists as 
the starting point for address compiling in rural areas, which 
depend now on census prelisting. If my remarks today seem to be a 
blanket endorsement and excessively laudatory, it is only because 
Bureau personnel have been so thorough and innovative in their 






May 16, 1986 



The Honorable Robert Garcia 

United States House of Representatives

Washington, DC 20515 

Dear Mr. Garcia: 

This letter is in response to your letter of May 5, 1986 in which 
you thank me for testifying at the May 1 hearing on processing the 
1990 Census. I, in turn, should like to thank you for the invitation
and for the honor and privilege of appearing before your subcommit- 
tee. I hope that my testimony will be of help in your analysis and 
decision-making concerning Census operations. 

With respect to the questions in the attachment to your letter, 
my response is as follows: 

1. I examined all of the documents concerning the address list
compilation tests that were sent to me before my testimony, and in
addition, I read the summary report of those activities that was the 
basis for presentations by Bureau personnel at the recent Census 
Research Conference and at the April 1986 meeting of the Census 
Advisory Committee of the American Statistical Association. Assuming 
that the description of the results and the cost factors cited in 
those documents are accurate, I concur completely with the conclu-
sions of the Bureau concerning the relative effectiveness of the
various methods of compilation. I believe that the Bureau was fair 
and objective in its design and evaluation of the several experiments, 
and that full opportunity was provided in those experiments for the 
demonstration of the advantages of new methods of compilation if such
advantages did in fact exist. To repeat a point that I made in my 
oral testimony, the address compilation tests were ideal for precen- 
sus execution because the results did not depend on public cooper- 
ation. 

2. In contrast to the address compilation experiments, other 
precensus tests — for example, the test of the two-stage census 
methodology — depended heavily for their successful execution on the 
public's behaving much the same as it would have in a full-scale 
decennial census. In the 1985 Jersey City Test Census the mail 
return rate was only 38 percent, very much lower than the rates 
experienced in the worst districts of the 1980 Census. The low level
of cooperation was probably due to a number of interacting factors, 
among them the fact that the operation was simply not "the real 
thing," and thus the spirit of participation in a national ceremony was 
absent. With the poor response rate in general, proponents of the 
two-stage approach to data collection will be able to criticize the 
Bureau's conclusion of failure of the method on the ground that
conditions were not sufficiently representative of true Census
operations — implying that the proposed two-stage method did not
have a fair chance to display its advantages.







I do not agree with the above-mentioned critics of the Bureau's 
conclusions, but my reasons for disagreement lie in a strong prior
belief that the two-stage process would not work, not in any great 
reliance on the results of the 1985 Census test. The Panel on 
Decennial Census Methodology of the Committee on National Statistics, 
of which I am a member, recommended in its 1984 interim report that
the experiment with two-stage operations not be included in the 1985
Jersey City pretest, and that scarce resources be applied to other 
tests for which the outcomes were less certain a priori. The point 
that I wish to make here is not that the Census was remiss in its 
execution of the test, once the decision was made to include it in the 
1985 operation, but rather that it was not a good idea to do the 
test in the first place because of the high prior probability of 
failure; and if one does not agree with that viewpoint, that the test 
should have been embedded in the Decennial Census rather than in a 
precensus operation with low likelihood of widespread public cooper- 
ation. It follows that this last point of criticism applies to any 
test for which unambiguous interpretation of the results depends on 
a high level of public compliance with the request for data. 

3. With respect to the last question about possible Census use 
of other government files to reduce its task, I have done very little 
research on this matter, and I can only give you an opinion based on 
my general experience in studying Census operations and on my 
professional experience as a teacher and practitioner of statistics. 
The essence of statistical estimation and inference in many practical 
applications involves the use of information on certain variables to 
reduce uncertainty about the values of other variables of interest. 
The Census task of counting and estimating the characteristics of the 
population is no different in that regard, and thus, if there were no 
concerns about confidentiality, the use of information from other 
governmental agencies would be of great value in attaining a better 
count, as well as more accurate measures of person, household, and 
family characteristics. Family income, for example, would certainly be 
better measured if the files of the Internal Revenue Service could be 
used routinely to provide that information instead of having to elicit 
it in the long-form Census schedule; and there would be considerable
ultimate cost savings, as well as reduction of burden on respondents. 

I would be the last person, however, to suggest that such a 
merger of data could be easily accomplished without raising public 
suspicion about invasion of privacy. Yet, if the public believes that 
the information that it provides to the Bureau is adequately pro- 
tected from disclosure at the personal level, and if it places similar 
trust in the IRS, there is no reason in principle that it should not 
place equal trust in a joint Census-IRS use of the same data. The 
problem is one of figuring out how to achieve that level of trust, 
while accomplishing the desired end — i.e., more accurate measurement. 
I do not have the solution. I earnestly hope, however, that the 
attitude in the future of all branches of government — executive, 
legislative, and judicial — will be to try to find ways to 
facilitate the merger of agency files for the purpose of more 
efficient and accurate measurement, while protecting our rights to 
privacy, rather than taking the position that any cross-agency use of
files will necessarily lead to damage of our rights. 



Sincerely yours, 







Mr. Ackerman. As the Chair previously suggested, I would like 
to ask Dr. Keane to join the table at this time and to share with us 
some of his thoughts and responses to the testimony that we heard, 
and the Chair would like to advise that we do have a limited 
amount of time, so we will keep the free-for-all down to a mini- 
mum. 

Mr. Keane. Thank you for that opportunity, Mr. Chairman. 

I should identify my colleague, Peter Bounpane, who is known to 
this subcommittee because he usually accompanies me to hearings 
like this, and I am glad he is here. I would like him next to me, but 
we cannot have everything. 

Mr. Ackerman. We could arrange that. Peter, if you would like 
to move your chair to the other side. 

Mr. Keane. I will start talking while they are moving, and both
of us would like to comment so that may help the future discus-
sion, and we will be to the point. Mine are a bit more generic than
Peter's would be. 

I would like to talk, first of all, about the size and scope of an
activity the size of a decennial census in this country, and we have 
quite a bit of comparative experience over time as well as with 
other countries. For instance, in Saudi Arabia there are about 10 
million people. I just came back from Egypt where they are doing a
census of about 50 million people. When you are talking about a 
quarter of 1 billion people — quarter of 1 billion people — it is a new 
contest. Things that work in large surveys do not necessarily work 
here. 

For instance, I had a conversation with American Telephone & 
Telegraph — predivestiture. They mail — and this is relevant to us— 
3.1 million annual reports. When I told them about the Census 
Bureau mailing 30 times that many, how would they handle it, 
with what kind of technology, their eyes rolled. So as high technol- 
ogy an organization as AT&T would not know how to handle the 
kind of mailings that faced us in the past and will face us in 1990. 

To elaborate a little further, when we are talking about some- 
thing like a census done for a tenth of the country over a 10-year 
span, we immediately get into cost considerations that are already 
sensitive now, that could be extraordinarily so with the kind of 
money and budgeting involved; and there is also an issue of the 
constitutionality of doing something like that. And we know how 
difficult it is to make an amendment to the Constitution. So those 
are the kind of considerations. 

We are very much enriched by this kind of a hearing and these 
kinds of comments. All of the individuals here are known to us and 
we value their comments. 

I would tack on four considerations when it comes to automation 
of the 1990 census. One is the hardware and particularly the instal- 
lation and maintenance obstacles, challenges posed in the decen- 
tralized kind of mode which we discussed. Related to that is soft- 
ware development, enormous software development, the kind of 
leadtime that requires; and also the acceptable space, trying to ac- 
quire that in the places and in sufficient amounts that are afford- 
able to us. These things have not been mentioned, the space, for 
instance. The staffing has and that is the final of the four points. 






To get the number of managerial and technical people in the lo- 
cations needed, to get them trained, to get them in synchronization 
with each other and with the other four areas pose monumental 
challenges. 

So I summarize by saying this is the largest Federal Government 
program in the sense it touches more people than anything else 
this Government does, and that means that it is in a special catego-
ry by itself. Things that work elsewhere in other large surveys or
other countries or in past times do not necessarily work here. That
is why if we seem a bit cautious at times it is with good reason. 

If I may turn to my colleague. Thank you. 

Mr. Ackerman. Thank you, Doctor. 

Mr. Bounpane. Thank you, Mr. Chairman. 

I am not going to address all the issues I heard raised, in the in- 
terests of time, but let me just pick a few I thought were key — and 
some that recurred among the four — and try to say just a few 
words about those. 

The first is the general question of "Have we gone far enough in 
automating 1990?" Picking up a little bit on what Jack just said,
this is a very difficult question and, obviously, people have differ- 
ent opinions about that particular issue. To some, it may seem our 
eventual choice was perhaps too conservative, that we could have 
been further along the continuum than we actually are. 

In some respects the answer to that is, "Yes, we were physically 
able to be a little further along that continuum." There are ma- 
chines that can do the kinds of things that people have said here 
this morning. The question is can they be purchased, implemented, 
tested and work properly in a huge one-time activity like the 
census? It comes very quickly. It is over very quickly, and then it is 
done. You have to do something with the equipment at the end of 
the census, and our eventual choice considered those factors as well 
as what is physically available on the market at the present time. 
And there are differences of opinion about this. We think we took 
the choice that maximizes the use of equipment available to us but 
gives us the highest probability of no failure in 1990. That would be 
very bad, to have a fully automated census, be relying on it, and 
push the button in April 1990 and have it not work. 

We tried to balance that risk against the risk of not going far 
enough. This difference of opinion that exists here at this table also 
exists with many people within the Census Bureau. It is a difficult 
choice. 

One thing that should also be mentioned here is our reluctance 
to use multiple systems within the census, and in general that is 
correct. We like to have one way to do things and to use that one 
way throughout the country. That is because we have to hire a 
large temporary work force, train them quickly and ask them to do
this job. The more exceptions you have to the rules, the more diffi-
cult it is to manage. And so, as Bill Eddy pointed out, it would be
very possible to collect certain information over the phone, have
the respondent call in rather than fill the questionnaire in by 
pencil. We were convinced that not enough people could do that to 
make that the sole collection technique and, therefore, we use 
paper. We do, however, allow respondents to call in in some in- 
stances, perhaps not to the extent he thinks we ought to. 






A question that was raised by Dan Horvitz was, "Should we use 
more computer-assisted telephone interviewing?" Here again, once 
we had to develop a system to use the paper questionnaires, the 
overhead costs of introducing a whole new CATI system just to 
handle those people who failed edit — that is, did not answer all the 
questions they were supposed to — was pretty large. We judged that 
it is not worth that investment for that small a universe, not that 
we do not want to use telephone interviewing and not that the ben- 
efits that were pointed out are not really there. 

Our question is, "Can we get a technique in place and manage it 
in a large-scale census?" We are not always so stubborn on having 
only one system. Let me point out one example already raised here 
and that is, "How do you check in the questionnaire with a bar 
code?" As we pointed out, it could be done one of two ways: A laser 
sorter, which is simply a machine that reads bar codes, or a simple 
wand run across the bar code. Laser sorters are extremely fast. As 
many as 30,000 questionnaires an hour can go through one of these 
machines. It is very helpful, but they are very expensive. No after- 
life after the census is a major consideration. Using a small person- 
al computer with a wand attached is much slower, but that kind of 
a machine has tremendous use after the census, and so we struck a 
balance. We have decided to use laser sorters in a chunk of the 
country, where we think we need to check in very fast, but, in the 
balance of the country we will use small computers with a wand, 
and they will have tremendous use after the census. So occasional- 
ly we do bend this rule. 

Also I should point out 

Mr. Ackerman. Could I ask a question at this point concerning 
the economy. I think I hear from what you are saying that the 
preferable method would be the laser use except for the fact that 
there is a one-time cost and no afterlife. Would it not be possible 
for these to be leased or for arrangements to be made to buy and 
resell them afterward? Has anybody investigated the cost compari- 
son that way? 

Mr. Bounpane. Yes; we have looked at that. So far we have not 
been able to find them to be leased. Perhaps it is possible, but no 
one has been willing to lease them to us because they have to be 
built to order; and that being the case, we have not found an after- 
census purchaser for them at this point in time. 

I might point out, Mr. Chairman, laser sorters cost about a quar- 
ter of a million a machine. 

Mr. Ackerman. How many would you need? 

Mr. Bounpane. Depending on how you did this, as many as per- 
haps 30, 40. 

Mr. Ackerman. That is $7½ million, if my math is correct.
Mr. Bounpane. Yes. 

Mr. Ackerman. What does the labor cost of all those people with 
the magic wands come to? 

Mr. Bounpane. I do not know the answer. It is probably more. 

Mr. Horvitz. According to the number the Bureau presented in 
the technical reports associated with this area, the cost trade-off 
was much in favor of the wand and personnel rather than the laser
sorters, so I do not understand the Bureau's still strong interest in
laser sorters. 






Mr. Bounpane. The reason we are interested in laser sorters is 
that, in certain parts of the country, it is very difficult to get 
enough people to do this job where we have to do it in a short 
amount of time, and it is better to use the available people for 
other things. For example, in an area like New York, where we 
want to hire enough people to make sure we have enumerated ev- 
eryone, it might be very beneficial to have a laser sorter there so 
we can free up as many human resources as possible for the most 
difficult task, which is the enumeration. That is the kind of reason- 
ing we went through. In some parts of the country it makes sense. 

Mr. Ackerman. Anybody is free to respond if you want at any
time. 

Mr. Bounpane. I thought the suggestions on the year 2000 were 
very good, and we will certainly look into those. It seems like a 
long time between censuses and there should be plenty of time to 
plan for the next one, but our experience is generally there is not. 
Even though we do have test censuses, it is hard to learn every- 
thing we would like to learn to plan the next one, and so we should 
probably make more use of each census to learn for future cen- 
suses. 

Many of the suggestions that were made we would very much 
like to try in the 1990 census to see if it is possible to use them in 
future activities. So we are going to consider nonpaper collection 
technologies, allowing people to call in and use the computer, and 
optical character recognition. There seems to be some problem
with OCR when you have handwritten letters as opposed to just 
numbers, but at least we will experiment with that in 1990 so we 
will have information on those technologies to use in planning for 
the 2000 census. 

The issue of transmitting data over the phone lines came up in 
several people's remarks, and we have looked at that. That is defi- 
nitely possible, to transmit information over phone lines from de- 
centralized locations to centralized locations. We have decided, 
however, not to do that for two reasons: The first reason is our ex- 
amination of the process says that moving the data is not our real 
hangup — at the point we actually have the data collected, moving 
it physically is not our real problem. The second reason, which is 
also very, very important, is that we were very worried about put- 
ting data into phone lines, even with encryption devices, since that 
might allow someone to intercept it. It would only take one person 
to do that to cause a significant problem. The alternative is to 
hard-wire with our own dedicated lines, but the cost of that would 
be just enormous. 

A couple of other thoughts. I thought Judith made some very 
good comments about the products, and I would say that our plan- 
ning is definitely in the direction of more CD-ROM type activities 
and less micro, floppies and microfiche. Although we get stuck in a 
dilemma here, that our products must serve a wide range of users 
and those of us in the room are very familiar with electronic de- 
vices, but there are people in the United States who are not at the 
same level and they want census data too. So again we have to find 
a compromise which may not be as high a level as some may wish. 

And a couple of words on the edit, because that also came up 
fairly strongly, that it is disappointing to some that we are not
doing an automated edit throughout the entire country. We would
also like to have done that. We agree that a hand edit has difficul-
ties. That is why we moved towards concurrent processing. We
would like to have been able to do automated edit nationwide. Our 
judgment is, however, we cannot accomplish that. 

The point I would like to make is that even in those areas of the 
country where we are planning a hand edit we also plan to do a 
computerized, automated edit later at a more relaxed time period. 

Before I close, I would like to elaborate on something Ben King
said. I know it is not the purpose of this hearing to talk about 
public cooperation, but Ben made that point and I think it is a very 
good one. These are good issues here and we should talk about 
them and should resolve them, but the census is eventually going 
to be dependent on people's willingness to cooperate with it, and 
that is an important issue and we would like to spend as much 
time on that as on some of these others. 

These are some of the thoughts I jotted down as I listened to the 
testimony. Thank you, Mr. Chairman. 

Mr. Ackerman. Thank you very much. 

Dr. Eddy. 

Mr. Eddy. I would like to speak to the optical character recogni- 
tion issue just briefly. Several years ago I was a consultant to a 
USPS vendor and they were at that time delivering to the USPS 
sorting machines that used optical character recognition technolo- 
gy, and it is my understanding that they were able to not only read 
the ZIP Codes but actually city names and states and they were 
able to look up these in various tables and so forth at rates compa- 
rable to the rates for laser sorters that Peter just mentioned. 

I think it is probably impossible to implement this sort of thing 
for 1990, and I would not encourage it; but by the year 2000 I 
would presume— I do not want to say it is trivial, but it certainly 
should be quite doable by then, and I would like to strongly encour- 
age it. 

On a different note, I am really a little concerned with the rapid- 
ity with which the Census Bureau wants to dismiss the notion of 
electronic transmission of the data. They may or may not be aware 
that there are other organizations in this country which are more 
concerned about security than they are, and they seem to be able 
to make use of the public telephone network. I am thinking not 
only of various federal government agencies such as the Depart-
ment of Defense, but rather public corporations such as commer-
cial banks which transmit hundreds of millions of dollars through
the public telephone system on a regular basis, and they do use en- 
cryption. There are various encryption standards; and if they are 
good enough for banks and for the Department of Defense, I am 
certain they are good enough for the Census. 

Thank you. 

Mr. Ackerman. Thank you, Doctor. 

Mr. Bounpane. Just to add one thing, that is a reasonable point 
and I did not want to leave the impression we were doing nothing 
on phone lines. We do transmit certain information and are doing 
some experiments on some others. The question was whether the 
whole data file should be transmitted over phone lines. 






Mr. Keane. I have a comment on Professor Eddy's last point. If
you are referring to the intelligence community as someone else, 
for instance, they probably do not have our mandate which is to 
provide their data to virtually everybody. It might be quite the op- 
posite. So that is one consideration. 

Another consideration is those commercial firms that do use the 
electronic dissemination and how far we go as a supplier and how 
far they would like to see us go and what the relative efficiencies 
are, that that is something that from a strategic planning stand- 
point the Census Bureau is concerned with right now. 

And, finally, there is always a cost consideration in the sense of 
what is the most efficient medium and, using all the media with 
which we might disseminate data, how do we get a complementary 
balance. 

Mr. Ackerman. Yes, Dr. Horvitz. 

Mr. Horvitz. I just wanted to comment that I had made a sug- 
gestion that it is time to consider some census alternatives than 
one that mobilizes every 10 years. It seems to me the comments we 
heard here from Mr. Keane and Peter Bounpane suggest, in fact, 
the problem of logistics is a very real problem in organizing the 
census and determining where the offices are to be located, how 
many offices, what kinds of people to hire, the problems of training 
those people. Those are major issues, and they seem to stand in the 
way, in fact, of certain decisions that would move the Bureau 
closer towards a level of automation that I certainly feel was 
achievable for the 1990 census and which from my standpoint now 
is not going to be even approached. 

Mr. Ackerman. Thank you very much. 

Mr. Horvitz. I have not finished. I appreciate the opportunity to 
just say a few more words. 

Mr. Ackerman. In that case I won't thank you for a while. 
[Laughter.] 

Mr. Horvitz. Jack Keane raised the issue of the apportionment 
requirement, and I very readily recognize that that is an issue. 
What I am suggesting is it is time to look at alternatives to what 
we are doing and have been doing since 1790. I do not think that 
tradition ought to necessarily hold in a world that is changing with 
technology and in other ways so rapidly as it is. 

Now, if there is a census alternative and it stands up to all of the 
requirements that one would put on a census, then it seems to me 
that given that census alternative it is time then to examine the 
issue of apportionment and whether, in fact, there is an alternative 
to our current ways of deciding apportionment. I certainly would 
not suggest we abandon what we are doing now in terms of appor- 
tionment, but in the face of a decent alternative I think we should 
seriously examine that issue. 

So I think really that the Bureau ought to find what are the best 
alternatives for the United States and in the public interest and 
leave the problem of apportionment to the Congress, given the 
pressure that they would be under in the Congress to consider the 
alternative. 

I think that is all. 

Mr. Ackerman. Thank you. 






Let me thank the Director and the entire panel for their partici- 
pation with us this morning, for the expert testimony, for the re- 
sponses, for the presentations on behalf of the committee and espe- 
cially Chairman Garcia. We would like the participants, if they 
would be willing, to be prepared to answer in the future any ques- 
tions that the chairman or the subcommittee might direct to you so 
that we may have those included in the record as well. Thank you 
all very much. 

To summarize the experts' remarks and to react to the Bureau's 
entire plan, we have Prof. Stephen Fienberg who is the Maurice 
Falk Professor of Statistics and Social Science at Carnegie Mellon 
University. Dr. Fienberg is the current chairman of the Committee 
on National Statistics of the National Academy of Sciences, and as 
such he has been instrumental in reviewing the work of the Census 
Bureau conducted by the Academy. He is also vice president of the 
American Statistical Association. 

Dr. Fienberg, welcome. 

STATEMENT OF STEPHEN E. FIENBERG, MAURICE FALK PROFES- 
SOR OF STATISTICS AND SOCIAL SCIENCE AT CARNEGIE 
MELLON UNIVERSITY 

Mr. Fienberg. Thank you, Mr. Chairman. I was getting lonely 
back there being the only witness excluded from the table. 

It is a pleasure for me to appear before the subcommittee and to 
participate in its review of the Census Bureau's plans for 1990. I 
have been following their planning effort now for some time with 
great interest, in part because of my personal research activities 
and in part because of my activities as chairman of the Committee 
on National Statistics. The committee has had a special independ-
ent panel commissioned by the Bureau which has been examining
the methodology for 1990. The panel issued a report last fall enti-
tled "The Bicentennial Census: New Directions for Methodology in
1990," and we made copies available to the subcommittee staff and
to the subcommittee members. Some of the panel's comments and
recommendations are germane to the topics that are the focus of
today's hearing as well as to the subcommittee's ongoing responsi- 
bilities. 

My comments today are going to be focused in three areas: One 
is just a reminder about the extent of the long-range aspects of the 
census planning; the second is going to be an attempt at a very 
broad and sweeping summary of the remarks and testimony of the 
other witnesses today; and then finally I want to mention addition- 
al areas of census planning for 1990 that I believe require outside 
scrutiny and comment as well as congressional oversight and guid- 
ance. 

Information gathered as part of the decennial census is used not 
only for purposes of congressional reapportionment and for State 
and local redistricting but also for the distribution of billions of dol-
lars of Federal funds and for a host of other government and non-
government purposes. With so much riding on the outcome of the 
decennial census, we should not be surprised at the extent of plan- 
ning required. Census taking is a massive enterprise. Planning for 
1990 officially began in the fall of 1983 with the appropriation for
fiscal year 1984, but the real planning began much earlier, well
before the 1980 census was actually taken. That planning program 
included several experiments and post-enumeration studies de- 
signed to help develop and improve methodology for subsequent 
censuses, including the 1990 census. 

The need for such long-range planning efforts underscores both 
the strength and the weakness of the Bureau's current activities. 

Mr. Ackerman. Dr. Fienberg, let me just, if I might, advise you 
there is a vote presently in progress so that you can properly pace 
yourself and we would not have to adjourn and reconvene. You are 
going to have a total of about 7 minutes.

Mr. Fienberg. I will be done. 

Today's hearing focused on four very specific technical aspects of 
the methodology for 1990, and in each area you have heard about 
the extent of the long-range plans of the Census Bureau staff and 
the step-by-step evaluations in which they are currently engaged. 
These efforts are consistent with the Census Bureau's publicly
stated minimum goals which are: first, to conduct the census with- 
out increasing the per housing unit cost in 1980 dollars; second, to 
expedite the availability of the data to users; and, third, and I want 
to emphasize the third one, to maintain a high rate of overall cov- 
erage and to improve the accuracy of small area data. 

Improved address lists and automation are two of the keys to the 
first two goals I enumerated, but by focusing on them the Bureau 
has implicitly assumed these will help achieve goal three. 

Today's presentations have not really provided any direct sup- 
port for such an assumption. I continue to be concerned about over- 
all coverage, undercoverage in selected population groups and 
areas, and the accuracy of small area data. 

The outside experts who examined the Census Bureau plans on 
processing procedures and commented on them this morning have 
all praised the Bureau for its efforts, and I concur in that praise. 
But some of these experts have also expressed concern that the 
1990 census will be done with 1970 and 1980 technology. That is, 
there is too much satisfaction with the status quo and not a rapid 
enough movement toward techniques and approaches already in 
widespread use outside the Bureau. 

As an example, I would take the IBM PC/XT, the personal com- 
puter that is a cornerstone in some of the automation procedures. 
We acquired several of those a few years back in my department at 
Carnegie Mellon and they have since been discarded for new com- 
puters. The IBM PC/XT that was sitting in my office is used pri- 
marily by my children rather than by the professionals working at 
Carnegie Mellon University. 

These experts have also expressed the need for research on new 
approaches to census taking if we are to have quality and cost ef- 
fective population data in the future. I note that comments have 
much in common with the recommendations of the committee's 
panel. In many ways the panel took as its starting point the struc- 
ture for the 1990 decennial would resemble that for the 1980 and 
the panel supported the Bureau's approach to address list compil- 
ing and the variety of automation procedures that we heard de- 
scribed today. 






The panel then focused on four other topics: Plans for research 
and ongoing experimentation with new methodology in 1990 and 
beyond; improvements in questionnaire content and evaluation of 
questionnaire changes; coverage improvement evaluation; and esti- 
mation and the adjustment of census data for undercount and over- 
count. Each of these areas is critical to the successful completion of 
the 1990 census, and each has been the focus of research and eval- 
uation activities at the Bureau. 

If you consider, for example, the problem of estimation and ad- 
justment, you will see the nature of the concern and the impor- 
tance. The Bureau itself has raised the issue of whether the census 
ought to be viewed as a count, an estimation effort, or some combi- 
nation of the two. Many of us outside the Bureau have come to 
think about census taking as a statistical estimation problem, but 
we agree with Census Bureau staff that the census effort should be 
some combination of estimation and counting. 

The issue is what combination? Even though census data will not 
be gathered for another 4 years, the Bureau needs to set in place in
the near future, that is, in the next year or two, operational ver- 
sions of adjustment procedures if the full value of census adjust- 
ments and estimations is to be realized. This means a firm timeta- 
ble must be established that allows for outside scrutiny and com- 
ment and for congressional oversight and guidance. 

Prof. William Kruskal, a former colleague of mine at the Univer- 
sity of Chicago, remarked that: "The decennial census is a national 
ceremony and a symbol of the relationship between citizen and 
government." But if we are to go beyond the ceremony and the 
symbolism, we need to look carefully at the uses of decennial 
census data and we need to make a professional evaluation of the 
methodology for data acquisition and analysis. 

This subcommittee's ongoing review and oversight of the Census 
Bureau plans are critical in the decisions on methodology for the 
1990 census. These decisions will help to shape the information
base that will guide Government policy into the 21st century. It 
has been my privilege to help the subcommittee in this effort. 

[The statement of Stephen E. Fienberg and his response to writ- 
ten questions follow:] 






May 1, 1986 

STATEMENT 
OF 

STEPHEN E. FIENBERG 
MAURICE FALK PROFESSOR OF STATISTICS AND SOCIAL SCIENCE AT 
CARNEGIE MELLON UNIVERSITY 
TO THE 

SUBCOMMITTEE ON CENSUS AND POPULATION 
OF THE 

COMMITTEE ON POST OFFICE AND CIVIL SERVICE 
U.S. HOUSE OF REPRESENTATIVES 

Mr. Chairman, it is a pleasure for me to appear before your Subcommittee again,
and to participate in your review of the Census Bureau's plans for the 1990 Decennial
Census. I have been following this planning effort with great interest, in part
because of my personal research activities, and in part through my position as
Chairman of the Committee on National Statistics, at the National Academy of
Sciences. The Committee has had a special panel, commissioned by the Bureau of
the Census, which has been examining the methodological planning for 1990. The
Panel issued a report last fall entitled: The Bicentennial Census: New Directions for 
Methodology in 1990, and some of the Panel's comments and recommendations are 
germane to the topics which are the focus of today's hearing, as well as to topics 
that may be the subject of future hearings. 

My comments are divided into three parts; the first part is a reminder of the 
extent and long-range aspects of census planning; the second part is my attempted 
synthesis of the remarks and formal testimony of the other witnesses; and the third 
part focusses on additional areas of census planning for 1990 that I believe require 
outside scrutiny and comment as well as congressional oversight and guidance.

CENSUS PLANNING

The information gathered as part of the decennial census is used, not only for
purposes of congressional reapportionment and for state and local redistricting, but
also for the distribution of billions of dollars of Federal funds, for a host of other
Federal government needs, and for a variety of purposes by researchers, planners,
and decision makers in business, state and local governments, and academic
institutions. With so much riding on the outcome of the decennial census, we should
not be surprised at the extent of planning required -- after all, census taking is a
massive enterprise. 

Planning for the 1990 Census officially began in the fall of 1983 with the 
appropriation for fiscal year 1984, but the real planning began much earlier, well 
before the 1980 census was actually taken. The planning program for that census
included several experiments and post-enumeration studies, designed to help develop
improved methodology for subsequent censuses including that of 1990. The need for 
such long-range planning efforts underscores both the strength and the weakness of 
Bureau of the Census's current activities. As was noted by the Panel on Decennial 
Census Methodology: 

To the general public and many casual users of census data, it may 
appear that the Census Bureau has ample time to plan wisely for the 1990 
Census, given the start of the planning process more than six years prior to 
Census Day, April 1, 1990, and the foundation of research already completed
in connection with prior censuses. In fact, as a review of the Census 
Bureau's field test schedule for 1990 indicates, there are relatively few 
opportunities to test thoroughly changes or modifications to census 
procedures, particularly if the changes represent major departures from the 
past. Moreover, evaluation of the likely impact of important changes is
hampered by the fact that pretests cannot adequately assess the effects of 
alternative procedures on public cooperation with the census — only tests 
conducted under census conditions, that is, experiments incorporated into an 
actual census as distinct from pretests, can fully address this important
question. 



In addition to the compressed time schedule for testing and research, two 
other critical factors affect the ability of the Census Bureau to modify 
census methodology: staff and budget resources. The Census Bureau has 
long been known for the high quality and dedication of its technical staff. 
The current budget for research on decennial census methodology,
particularly for research on the undercount, is large by the standards of
earlier censuses. Nevertheless, no agency of government, particularly in the
constrained world of the 1980s, can expect to have sufficient staff or 
resources to try out more than a few promising ideas and concepts. 
Pressures in the next few years to reduce the federal government's large 
deficit may make it more than usually difficult to obtain adequate staff and 
funding to carry out a thorough research and testing program for 1990. 

The Bicentennial Census (1985, pp. 4-5)






TODAY'S HEARING 

Today's hearing has focussed on four technical aspects of the methodology for
1990: (i) address list compilation, (ii) automated address control file and automated
check-in, (iii) concurrent data processing, and (iv) data products and processing. In
each area, you have heard about the extensive and long-range plans developed by
Census Bureau staff, and the step-by-step evaluations in which they are currently
engaged. These efforts are consistent with the Census Bureau's publicly-stated
minimum goals to

(a) Conduct the 1990 Census without increasing the per-housing-unit cost in 
1980 dollars, (b) Expedite the availability of the data to users, (c) Maintain a 
high rate of overall coverage and improve the accuracy of small area data
while reducing the overall differential for population groups and geographic
areas. 



Improved address lists and automation are two of the keys to goals (a) and (b), but 
by focussing on them the Census Bureau has implicitly assumed that these will help
achieve goal (c). Today's presentations have not provided any direct support for 
such an assumption, and I continue to be concerned about overall coverage, 
undercoverage of selected population groups and areas, and the accuracy of small 
area data. 

The outside experts who have examined the Census Bureau's plans on processing 
procedures and commented upon them this morning have all praised the Bureau for 
its efforts. I concur in this praise. But some of these experts have also expressed 
concern that the 1990 Census will be done with 1970s and 1980 technology, i.e. that 
there is too much satisfaction with the status quo and not a rapid enough movement 
towards techniques and approaches already in widespread use outside of the Census 
Bureau. They have also stressed the need for research on new approaches to 
Census-taking if we are to have quality and cost-effective population data in the 
future. What follows is a brief summary of the comments of each of the four
experts. 



Bailar (1984)







Testimony of Dr. Benjamin F. King 

Dr. King's evaluation of the Census Bureau's compilation of accurate lists of
addresses was quite laudatory. Such lists form the backbone of Census operations, 
and Dr. King found the results of recent tests supporting the use of outside vendor 
lists to be compelling and conclusive. 

Testimony of Dr. Daniel G. Horvitz 

Dr. Horvitz examined the Census Bureau's new approaches to automated address
control files and to automated check-in. While he was supportive of the direction of
current activities in this area, and the likely improvements in productivity and quality 
that they should produce, he expressed concern that Census Bureau staff were too 
tentative in their adoption of new computer-based technology. In particular, Dr. 
Horvitz noted the uses he would make in automated data collection and check-in 
procedures of Computer Assisted Telephone Interview (CATI) methodology. Dr. 
Horvitz also suggested a radically different approach to doing the census that would 
spread data collection over the decade instead of implementing a massive 
mobilization once every 10 years. 

Testimony of Professor William F. Eddy 

Professor Eddy also reviewed the materials on automated address files and check- 
in, and his remarks were complementary to those of Dr. Horvitz, encouraging the 
Bureau to make more effective use of computer technology. Professor Eddy strongly
supported the Bureau's move towards concurrent processing in 1990, but also
suggested that the Bureau move more boldly away from its plan to sort the paper 
forms. Professor Eddy concluded his testimony with some speculation of the impact 
of computers on census-taking in 2000 and beyond. Actually, part of this speculation 
is already a technological reality and should be influencing thinking about the 1990 
Census. 

Testimony of Mrs. Judith S. Rowe 

Mrs. Rowe provided the Subcommittee with a detailed evaluation of the Bureau's
plans for data products and dissemination. She supported the proposed schedule for
release of data products, but expressed reservations about the Bureau's current 
thoughts on media for dissemination. In particular, Dr. Rowe suggested a continued
movement away from massive printer reprints and especially away from microfiche.
She agreed with the focus on the emerging CD-ROM or laser-disk technology and
argued that laser disks should replace both microfiche and floppy disk technology.
She supported the role of CENDATA as a dissemination approach for some classes
of users. Finally, Dr. Rowe encouraged a continued focus by the Bureau on timely
release of accurate data, and discouraged the Bureau from attempting to compete with
outside groups and firms in the area of software development. 

ON WHAT OTHER TOPICS SHOULD ATTENTION BE FOCUSSED? 

I note that these comments from the outside experts have much in common with 
the recommendations and comments of the Committee on National Statistics' Panel 
on Decennial Census Methodology. In many ways, the Panel took as its starting 
point that the structure of the 1990 Census would resemble that for 1980, and the
Panel supported the Bureau's approach to address list compilation and "its efforts to 
develop improved automated procedures that have the potential to speed up data 
collection, improve accuracy, and reduce costs." The Panel's report then focussed on 
four other topics: 

(i) Plans for research on and experimentation with new methodology in the 
pretests and in the 1990 Census. 

(ii) Improvements in questionnaire content and the evaluation of questionnaire 
changes. 

(iii) Coverage improvement evaluation.

(iv) Estimation and the adjustment of census data for undercount and 
overcount. 

Each of these areas is critical to the success of the 1990 Decennial Census and each 
has been the focus of research and evaluation activities both within and without the 
Census Bureau. 

Consider, for example, the issue of estimation and adjustment. The Bureau itself
has raised the issue of whether the census ought to be viewed as a counting effort,
an estimation effort, or some combination of both. Many of us outside the Bureau 
have come to think about census-taking as a statistical estimation problem, but we 
agree with Census Bureau staff that the census effort should be some combination of 
estimation and counting. The issue is then what combination. Even though the 
Census data will not be gathered for another four years, the Bureau needs to set in 
place, in the near future, operational versions of adjustment procedures if the full 
value of census adjustment and estimation is to be realized. This means that a firm 
timetable must be established that allows for outside scrutiny and comment and for 
congressional oversight and guidance. 

Professor Wm. Kruskal of the University of Chicago has remarked: 

The decennial census is a national ceremony and a symbol of the 
relationship between citizen and government. Whatever one's view of the 
census, whatever one's philosophical position about the Federal Government, 
it may be argued that the census is one of our relatively few national, 
secular ceremonies. It provides a sense of social cohesion, and a kind of 
nonreligious communion: we enter the census apparatus as individual 
identities with a handful of characteristics; then later we receive from the 
census a group snapshot of ourselves at the ceremony date. Like many 
family pictures, the snapshot is a little blurry in spots, but recognizable and 
fascinating to compare across the decades. 

These symbolic matters are not just poetic speculation. I believe that
they play important roles in the actual carrying out of the Census, in
congressional digressions of the Census, in beliefs that take extreme
positions about the accuracy of the Census, and no doubt in other ways.

Kruskal (1984, pp. 49-50)

But if we are to go beyond the ceremony and the symbolism, we need to look
carefully at the uses of the decennial census and we need to make a professional 
evaluation of the methodology for data acquisition and analysis. This 
Subcommittee's ongoing review and oversight of the Census Bureau's plans is a 
critical component in the decisions on methodology for the 1990 Census. These 
decisions will help to shape the information base that will guide government policy 
into the 21st century. It is my privilege to assist you in this review.






RESPONSES BY STEPHEN E. FIENBERG TO 
QUESTIONS POSED BY CHAIRMAN GARCIA FOR 
THE RECORD OF THE HEARING OF MAY 1, 1986



1. Question: Please inform us where your plans stand in reviewing the Census 
Bureau's decennial plans. When will the panel meet next to review the Census 
Bureau's plans? 

Answer: As you are aware, the Committee on National Statistics' Panel on
Decennial Census Methodology completed the first phase of its work with the report 
issued in September 1985. There had been tentative plans for a continuation of the 
Panel's activities, including intensive reviews of methodological planning and the 
evaluation of results from the 1985 and 1986 test censuses. The Bureau of the 
Census did not fund this second phase and only recently extended the contract for 
support of the Panel in a substantially reduced amount. This additional funding will 
cover a single meeting of the Panel, and we have tentatively scheduled this meeting 
for September, a full year after the Panel last met. We have also had preliminary 
discussions with Bureau of the Census staff regarding funding for the Panel for 
FY 1987. 

2. Question: Considering all of the work of the Census Bureau and its plans for the 
1990 Census, what do you think are the two or three most important actions that this 
subcommittee could take to improve the census and help it to be more accurate, 
timely, and useful? 

Answer: The two areas in which the Subcommittee can most effectively assist the
Bureau of the Census are in the procurement of computer equipment, and in oversight 
of plans for adjustment of censal counts. 

First, I draw your attention to a recommendation by Dr. Horvitz in his written
statement:



"This Subcommittee, and the Congress, could assist the decision process 
at the Bureau by, first, demanding careful documentation of the computer 
hardware requirements to conduct an automated census at the planned level; 
second, based on this documentation, negotiating a mutually acceptable set 
of requirements; and third, providing some assurance to the Bureau that the 
separate budget line item for the computer hardware included in the mutually 
acceptable set of requirements will remain intact. I recommend completion 
of this process no later than the end of this fiscal year." 

The Bureau needs to know what hardware it will use and it needs to develop the 
software for that hardware in order that it can be tested in the dress rehearsal 
scheduled for 1988. 

Second, I would urge the Subcommittee to schedule a hearing to review plans for 
adjusting the census, and to set a firm schedule for decisions regarding adjustment. 
Adequate time must be allowed for professional and congressional input on the issue 
of adjustment. Attention to this issue by the Subcommittee can help the Bureau 
reach a timely and reasoned decision. 

3. Question: Do you think that the Bureau's processing plans will make adequate 
allowance for evaluating the census coverage on time to report it when the 
apportionment figures are completed? 

Answer: The Bureau's move towards concurrent processing should allow for census
coverage evaluation in a more timely fashion in 1990 than was the case in 1980.
Nonetheless, if 1980 is to serve as a guide, the Subcommittee is wise to be
concerned about whether there is adequate allowance for coverage evaluation prior to
the deadline for reporting of apportionment figures. After all, in 1980 the Bureau's
demographic analysis initially showed a net overcoverage of the white population, a 
result which subsequent analyses showed to be incorrect. New approaches to 
coverage evaluation, if adopted, would offer greater assurance that an adequate 
evaluation would occur by the time apportionment figures are due. 







The Panel on Decennial Census Methodology made several suggestions for 
improvements in the area of coverage evaluation, including the use of sampling for
follow-up of nonrespondents and coverage and the use of systematic observation.
The Bureau had rejected most of these suggestions for 1990. One final method for
improving the timeliness of coverage evaluation is through the use of a
pre-enumeration survey in place of the post-enumeration survey used in 1980. This
approach is still under consideration by the Bureau. 

4. Question: Please comment on the Census Bureau's internal report that suggests
they will decide whether to adjust the census after all the figures are in. 

* In your experience as a statistician, are there adequate procedures and 
agreed upon standards available to allow the Census Bureau director to 
make an unambiguous decision about adjustment after the results are 
known? 

Answer: I have been told about the report to which you refer and I think that any 
plan that will defer the decision on whether to adjust the census until all the figures 
are available is sheer folly. No other action I can think of is more likely to invite a 
repeat of acrimony and law suits that followed the Bureau's decision not to adjust in 
1980. This plan could destroy the credibility of the Bureau of the Census in other 
technical areas. 

The basic methodology for adjusting the census using data from pre-enumeration
or post-enumeration surveys is well-developed and widely-accepted. It is my belief
that adequate procedures and agreed-upon standards are available or could be 
developed in the near future in order to allow for an unambiguous decision about 
adjustment in advance. I would urge that closure on the details of adjustment 
methodology should be reached in the near future and that a full-scale test be 
conducted in conjunction with the census dress rehearsal. A final and reasoned 
decision could then be made well in advance of the 1990 Census.







Mr. Ackerman. Dr. Fienberg, thank you very much for your re- 
marks and your testimony. The subcommittee greatly appreciates 
it and will request that you as well, if you will, be prepared to re- 
spond to any questions that Chairman Garcia may direct to you for 
inclusion in the record. Thank you very much and thank you all 
very much for being here. 

The subcommittee stands adjourned. 

[Whereupon, at 11:55 a.m., the subcommittee was adjourned.] 






CENSUS QUESTIONNAIRE AND AUTOMATION 



THURSDAY, MAY 15, 1986 

House of Representatives, 
Subcommittee on Census and Population, 
Committee on Post Office and Civil Service, 

Washington, DC. 

The subcommittee met, pursuant to notice, at 10:15 a.m., in room
311, Cannon House Office Building, Hon. Bob Garcia (chairman of 
the subcommittee) presiding. 

Mr. Garcia. Good morning and welcome to our hearing on the 
census questionnaire and automation. This hearing is a continu- 
ation of our series of hearings on the Census Bureau's plans and 
activities on the 1990 decennial census. 

Today, our focus is on the suitability of the census questionnaire 
content and design to the available technology. We will cover two 
areas. 

First, we will explore the possibilities of creating a shorter 
census questionnaire form. We have learned that not all data col- 
lected from 100 percent of the population are processed and dis- 
seminated. In 1987 the Census Bureau is due to report to Congress 
on the types of questions it plans to ask on the decennial, and in 
1988 the Bureau will report on the actual content of the question- 
naires. In order to curtail the huge costs involved with taking a de- 
cennial, maybe there are some questions that do not have to be 
asked of the 100 percent of our population but to only a segment of 
the population. 

The second area we will look into is the data conversion methods 
which the Census Bureau has considered for the 1990 decennial. By 
data conversion methods, I mean the ways in which the Bureau 
will convert data from questionnaire forms to computers. I want to 
find out the reasons behind the Bureau's recent decisions to rely on 
technology that dates back to the 1950's. 

Through this hearing, I would like to find out what obstacles the 
Bureau confronted that may not have allowed them to take full ad- 
vantage of modern advanced technology. Considering that we are 
only 4 years away from the 1990 census and that the Bureau has 
made its decisions regarding automation as it relates to the ques- 
tionnaire content and design, it looks as though the 1990 census is 
already a lost opportunity. But perhaps in carefully reviewing the 
background of the Bureau's decisions, we can look into opportuni- 
ties for the decennial in the year 2000. 

I would like those who are here representing the Bureau of the 
Census to know that yesterday we had a meeting with GAO. We 
went over some of the problems that they anticipate. I want to be
perfectly fair in terms of letting you know that these discussions
have taken place. There are some points that they brought up in
yesterday's meeting with me and the staff of the Subcommittee on 
Census and Population that I hope we would be able to get to 
today. 

Sometime after you finish your presentations, I would like GAO 
to come up. Then we can have an open dialog as to what the views 
are and how, hopefully, GAO, as the body monitoring what Census 
is doing, can be helpful in a constructive fashion to those of you at 
the Bureau of the Census who have to make these decisions. 

I just want to make it very clear — we are here as a committee 
offering absolute cooperation trying to find out how the three of
us (that is, GAO, the Bureau of the Census, and Congress) can
work actively together to ensure that we get a census that will be 
cost efficient and accurate. I think that is really going to be the 
key to any dialog we have today. 

I would go a step further. I would hope that we would be able to
conclude this hearing within 1 hour 15 minutes, 1 1/2 hours tops.
There may be some votes that come up on the floor of the House at
approximately 11:15 to 11:30 a.m. So, what we will try to do is to
expedite this. 

Now, Ms. Susan Miskura, who is the Chief of Decennial Planning 
Division, U.S. Bureau of the Census; Gene Dodaro, Associate Direc- 
tor, General Government Division of the U.S. General Accounting 
Office; and Gail Franke, who is the vice president of Federal Gov- 
ernment marketing, National Computer Systems, are our witnesses 
today. 

OK, what we will do is start with Ms. Miskura. 

STATEMENT OF SUSAN MISKURA, CHIEF OF DECENNIAL 
PLANNING DIVISION, U.S. BUREAU OF THE CENSUS 

Ms. Miskura. Thank you. Mr. Chairman, I will make brief re- 
marks here and submit my entire testimony for the record. 

With me is Peter Bounpane, Assistant Director for Demographic 
Censuses. 

I will discuss four topics today related to census planning. 

The first topic I will talk about is the content of the question- 
naires for the 1990 Census of Population and Housing. As we seek 
to determine questionnaire content for the census, we have one 
overriding goal: We must balance the needs for data against the 
length of the census questionnaire and the amount of time it takes 
respondents to fill it out. 

On the one hand, we must make sure that the 1990 Census of 
Population and Housing collects all the critical data our Nation 
needs to address population and housing issues in the 1990's and
beyond. 

On the other hand, we realize that public cooperation could be 
undermined if the questionnaire is too lengthy, or contains items 
that do not meet important public needs. 

We believe that we struck the proper balance for the 1980 
census. Public cooperation and acceptance of the importance of the 
census was excellent. In 1980, over 80 percent of the households returned
the census questionnaire by mail. This is quite an achievement
in a society as complex and mobile as ours.

To make sure that we ask only those questions that meet impor- 
tant public needs, we have held discussions with data users in a 
number of forums. These include local public meetings, special con- 
ferences, interagency working groups, and the Federal Agency 
Council. During these discussions, we are hearing many more le- 
gitimate and valid data needs than we can possibly satisfy. 

The census must collect data that are required to meet demon- 
strated public needs, or that are required to fulfill legal mandates 
and implement governmental programs. We asked the Federal 
agencies to identify all legal mandates and programs requiring cer- 
tain data. 

Census Bureau specialists apply a number of other criteria to de- 
termine a set of potential items for inclusion on the questionnaire. 
We then test proposed new items, or modified wording, format, and 
sequencing for items that have been asked in previous censuses. 

The National Content Test is our main testing vehicle. It is de- 
signed to provide information on the reliability of the data collect- 
ed, and the ability and willingness of respondents to answer the 
questionnaires. 

I mention in our written testimony some of the new items we are 
testing. At this time, Mr. Chairman, we believe that the question- 
naires, both the long and the short form, for the 1990 Census on 
Population and Housing will be about the same length as they 
were in 1980. 

We plan to ask the same population questions on a 100-percent 
basis as in 1980. The population questions asked of a sample of per- 
sons probably will be similar to 1980, but we are awaiting the re- 
sults of the National Content Test before making a final determi- 
nation. 

Three questions that appear on the housing page are really cov- 
erage questions. We are testing the possibility of reducing these to 
one question for 1990. We expect most of the housing questions 
asked on a 100-percent basis in 1980 will remain. We do plan to 
move the question on complete plumbing facilities to the sample 
questionnaire. 

The 100 percent housing questions were supported for inclusion 
by the interagency working group on housing issues. They also re- 
ceived strong support from local planners, especially from urban 
centers, who are major users of the data for census blocks within 
their cities. Several of the 100 percent items define housing units 
and help us to insure complete coverage of the population and 
housing inventory. 

With regard to the sample housing questions, we plan at this 
time to eliminate those dealing with stories and structure, elevator 
and structure, and cooking fuel. We are considering having one 
question instead of two on the number of automobiles and trucks 
in the household. 

We are testing the possibility of collecting certain housing data 
common to all units in a multiunit building by means of a struc- 
ture questionnaire. This questionnaire would be administered to a 
knowledgeable respondent such as an owner, manager, or superin- 
tendent of the building. This approach might enable us to collect
more accurate data and reduce the number of questions asked of
households in multiunit structures. 

We will report to our oversight subcommittees as the results of
our tests of questionnaire content become available. 

The second topic I will discuss, Mr. Chairman, is questionnaire 
design research, including focus groups. It is important to design a 
census questionnaire that enhances the response to the census, and 
the correct completion of every item on the questionnaire. That is 
why we have given much attention to the subject in the past, and 
why we are currently doing so as we plan the 1990 census. 

We conducted a number of studies in connection with the 1980 
census and are conducting studies related to questionnaire design 
in our 1986 test censuses and in the National Content Test. 

One of the things we have found in our studies is that the entire 
questionnaire package bears looking at. That is, the envelope and 
the inserts as well as the questionnaires. Thus, in the National 
Content Test this year, we are testing two different envelope de- 
signs to see their effects on the mail return rates, response rates 
for individual questions, and data quality. One envelope was de- 
signed to be attractive, the other to look official. 

In the 1986 test censuses in central Los Angeles County and east- 
central Mississippi we are testing the effects of including a motiva- 
tional insert in the questionnaire mailing package. This test is de- 
signed to see whether a brief written appeal for cooperation accom- 
panying mailing out of the census form can improve mail response 
rates and reduce question nonresponse. 

Some of our studies have also examined the effect of having our 
questionnaire designed to be read by FOSDIC. We used FOSDIC to
convert questionnaire data to computer-readable format in the 
1960, 1970, and 1980 censuses. FOSDIC does impose some con- 
straints on questionnaire design in terms of how questions are for- 
matted. Most of the questions are designed to be read by filling cir- 
cles. 

We conducted an alternative questionnaire experiment as part of 
the 1980 census in which we compared the 1980 FOSDIC question- 
naire to an alternative FOSDIC questionnaire and to a question- 
naire that was a non-FOSDIC form. There was very little difference 
in the mail return rates for the three questionnaire versions. We 
are now hiring a contractor to do additional research on question- 
naire design. 

Focus groups which typically consist of a dozen or so participants 
recruited from a target population are one medium for observing 
reactions to questionnaires. They can be used to gather ideas for 
more systematic research. We conducted focus groups as part of the 
1985 test census in Tampa, FL to gauge reaction to the optical 
mark recognition questionnaire. Participants reacted reasonably 
well to overall design of the questionnaire, but they mentioned 
problems with some questions due to format constraints. Prior to 
the 1986 test censuses in both Los Angeles and Mississippi, we
conducted focus groups as part of our Outreach Research Program. 
In both sites, focus group participants raised questions about why 
we were doing the census in general, and why we asked specific 
questions. Questions were raised about several population items, including
race and Spanish origin; about the coverage questions; and
about some of the housing items. Many of the comments indicated
that there is a need for more outreach and education in the decennial
census: How it benefits society and the fact that it is both
mandatory and confidential. 

We also conducted focus groups after the mailout of question- 
naires in Los Angeles as part of our research into nonresponse. 
Preliminary observations from Census Bureau observers are that 
few participants mentioned questionnaire design or content as rea- 
sons for failing to return the questionnaires. 

The third topic I will discuss is the technology we will use in 
1990 to convert questionnaire data into computer-readable format. 
As we mentioned at the May 1 hearing before this subcommittee, 
we recently decided that FOSDIC will be our primary data-conver- 
sion technology for the 1990 census. 

With the FOSDIC system we film the questionnaires and use
FOSDIC machines to read the data into the computer. We plan to 
set up FOSDIC systems in 10 to 14 processing offices. FOSDIC is a 
fast and accurate technology that has worked well for us in recent 
censuses. Of course, we will upgrade FOSDIC to make it even
faster and more accurate in 1990. 

We will use keying as a supplement to FOSDIC for entering 
some of the handwritten data on the questionnaires into computer- 
readable form. 

The fourth topic I will discuss is optical mark reading technolo- 
gy. We have considered optical mark recognition as an alternative 
to FOSDIC early in our planning and we tested it in our 1985 test 
census in Tampa, FL. 

One of the primary reasons we considered OMR for data conver- 
sion was that it might give us more flexibility in decentralizing the 
census processing system. I will discuss the issues we considered in 
reaching the decision not to use optical mark recognition technolo- 
gy as the primary data conversion technology for 1990. 

First of all, there were technological concerns. The standards 
that must be met by any technology for converting decennial 
census data are high. The OMR scanner we tested did not meet the
proven speed and accuracy of the FOSDIC system that we used in 
1980. 

Second, there was the issue of decentralization. During this test 
we discovered that the technology required a carefully controlled 
environment, in terms of temperature and humidity. These re- 
quirements would place significant limits on the type of space that 
we could obtain for the more than 400 collection offices in which 
we hope to use OMR. Based on this experience, we were concerned 
about the requirements for maintaining a widely distributed OMR 
system. 

Third, there was the issue of timing. We did not believe that a 
reliable OMR scanner that corrected the problems we had observed 
could be fully tested before our automation decision date of Sep- 
tember 1986. 

Finally, there was the issue of cost. We estimated that there 
would be substantial developmental costs in design and fabrication 
of an OMR scanning system that would meet our needs. Our decision
was based on available evidence from the 1985 OMR test experience,
the current state of the technology, the potential for decentralization,
the census equipment acquisition schedule, and the
costs involved. However, we do expect to consider OMR as part of a 
research and development program for the year 2000 census. 

Mr. Chairman, that concludes my testimony on questionnaire 
content, design and processing. I would be willing to answer any 
questions you might have. 

Thank you. 

[The statement of Ms. Miskura and her response to written ques- 
tions follow:] 






STATEMENT OF THE CHIEF OF THE DECENNIAL 
PLANNING DIVISION OF THE BUREAU OF THE CENSUS 



Susan H. Miskura



Before the Subcommittee on Census and Population
Post Office and Civil Service Committee
U.S. House of Representatives
May 15, 1986



Mr. Chairman, thank you for this opportunity to brief the Subcommittee
further on plans for the 1990 Census of Population and Housing. As
you requested, I will discuss four topics related to census planning:
(1) questionnaire content, (2) questionnaire design research and
focus groups, (3) data conversion methodology, and (4) optical mark
recognition technology.

Questionnaire Content

Mr. Chairman, the first topic I will talk about is the content of
the questionnaires we will use in the 1990 Census of Population and
Housing. We also discussed this topic at the hearing before this
Subcommittee on September 26, 1985.

The decennial census is the Nation's primary source of data for small
geographic areas and small population groups. A general principle
governs the selection of subject content for the census: The census
must be aimed solely at data that are required to meet well demonstrated
public needs or that are required to fulfill legal mandates or implement
governmental programs.







The 1990 Census of Population and Housing will mark the bicentennial
of census-taking in our country. From the very first enumeration
in 1790, the census has always been more than a simple headcount of
the population. It has asked questions that mirror the concerns
of our society. Over the decades, as our society became more complex
and our government more sophisticated, we added questions to the
census to meet new needs. By 1900, the census covered most of the
population questions we ask today. Concerns about the Nation's
housing during the Depression in the 1930's led to the addition of a
set of housing questions in 1940. The Census of Population became
the Census of Population and Housing.

As we seek to determine the questionnaire content for the 1990 Census of
Population and Housing, we have one overriding goal: We want to balance
the needs for data against the length of the census questionnaire and
the amount of time it takes respondents to fill it out. On the one
hand, we must make sure that the 1990 Census of Population and Housing
collects all the critical data our Nation needs to address population
and housing issues throughout the 1990's and beyond. These data are
used for many important purposes, from apportionment and redistricting
to planning and implementing social and housing programs and developing
economic policy. On the other hand, we realize that public cooperation
could be undermined if the census questionnaire is too lengthy or
contains questions that do not meet important public needs.







We believe we struck the proper balance for the 1980 census. Public
cooperation and acceptance of the importance of the census was
excellent. Over 80 percent of the households returned questionnaires.
This is quite an achievement in a society as complex and mobile as
ours, especially when we realize that there are many factors that can
contribute to nonresponse in the census.

The Applied Behavior Analysis Survey (ABAS), which we conducted during
the 1980 census, indicated that questionnaire length was not a significant
contributor to nonresponse. The survey found that the primary contributor
to nonresponse was the reported failure to receive a form in the mail.
Subjective factors such as how difficult the respondent thought the
form would be and how long the respondent thought it would take to
complete the form were associated with whether or not the form was
started. On the other hand, objective measures of respondent burden,
such as type of form received (short or long) and household size were
not associated with whether the form was started. Further, households
receiving long forms were just as likely as households receiving
short forms to start filling them out despite the fact that the long
forms were perceived as more difficult to complete than short forms.
This is also supported by the fact that the 1980 mail-return rates
for short and long forms were not significantly different.

Still, we believe there should be no increase over 1980 in net questionnaire
content for the 1990 census. We are looking for ways to shorten
the questionnaires; but, as we hold discussions with a broad array of
data users, we are hearing many more legitimate and valid data needs than 
we can reasonably satisfy. At this time, we believe the questionnaires 
(short form and long form) for the 1990 Census of Population and Housing 
will be about the same length as those used for the 1980 census. In 
1980, the short-form questionnaire asked at every household contained 
the 7 population questions asked of each person, 9 housing questions, 
and 3 coverage questions asked of all the population. The coverage 
questions are designed to make sure we count everyone at an address 
who should be counted there. The long-form questionnaire contained 
these questions plus the additional questions asked of only a sample 
(about 20 percent) of the population. 

To make sure that we ask only those questions that meet important 
public needs, we have held discussions with data users in a number of 
forums. Local Public Meetings (LPMs), cosponsored by the Census 
Bureau and local and state organizations, were primary sources of 
information on the uses of the data at the state and local level. 
The LPMs afforded a wide variety of users, from the private and 
public sectors alike, the opportunity to comment on the adequacy of 
the data and to suggest new or modified data elements for the upcoming 
census. At least one meeting was held in every state, and we completed 
the last of the 65 meetings in October 1985. Other forums and special 
outreach efforts, such as conferences dealing with housing issues or 
the needs for data on race and ethnic groups, also are major sources 
of suggestions on the content of the 1990 Census of Population and 
Housing. 

For determining Federal data needs, we have sought counsel from other 
agencies through 10 Interagency Working Groups and through OMB's 
Federal Agency Council on the 1990 Census. We asked the Federal 
agencies to identify all legal mandates or Federal programs requiring 
certain data. These exchanges have been important channels of 
communication. 

Census Bureau specialists also apply a number of other criteria to 
determine a set of potential items for inclusion on the questionnaire. 
These include, for example, whether the data are needed for small 
geographic areas or small, widely dispersed population subgroups. 
We then test proposed new items and modified wording, format, 
or sequencing for questions that were asked in the previous census. 
The testing program will help us determine which of the many valid 
data needs can be pursued for the census. 

We have conducted several content studies during the past few years. 
The National Content Test, which we are conducting right now, is our 
main testing vehicle. This test is designed to provide information 
on the reliability of the data collected and the ability and willingness 
of respondents to answer the questions. The mailout for the National 
Content Test occurred in late March 1986, followup will continue 
through the summer, and we will complete analysis of the results 
this winter. This will allow us to report to Congress by April of 
1987 on the proposed subjects for the 1990 census. Additional smaller- 
scale tests will be needed after that as we decide on final question 
wording. 

Planning and consultation to date have identified numerous new subjects 
for testing. The National Content Test forms, which include both new 
proposals and traditional census questions, contain about twice as 
many inquiries as were on the 1980 census forms. Testing will help 
us narrow the list of candidate questions, particularly if we find there 
are high nonresponse rates or data quality problems with some of the 
questions. Some of the proposed new or expanded topic areas we are 
testing in the National Content Test include: 

Population: 

o highest educational degree held (in addition to, or as a 
  substitute for, years of schooling) 
o disability limitations for children, and limitations in 
  self-care and mobility for the population in general 
o receipt of benefits from government programs such as food 
  stamps, Medicare, Medicaid, and energy assistance 
o health insurance and pension coverage 
o pension income 
o second jobs 
o vocational education 

Housing: 

o identification of residential care facilities 
o secondary heating fuel and equipment 
o identification of cooperatives 
o identification of congregate living units 
o presence of smoke detectors 
o condominium fees 
o mobile home costs 

New questions can be added to the census only if we can identify 
1980 questions that are no longer needed and can be removed, or by 
employing innovative sampling techniques that would allow us to ask 
more questions without increasing the reporting responsibilities 
for any one household. We are currently reviewing and discussing 
content sampling options. 

Now, Mr. Chairman, I will briefly outline some of the content changes 
we are considering based on what we heard in our discussions with 
local, state, and Federal data users. At this point, we plan 
to ask the same population questions on a 100-percent basis as 
in 1980: name, relationship to householder, sex, race, age, marital 
status, and Spanish origin. (The wording, format, or sequencing of 
these questions could change somewhat based on our test results.) 
The population questions asked of a sample of persons probably also 
will be similar to 1980, but we will await the results of the National 
Content Test before making a final determination. 

We are testing one question that could replace the three coverage 
questions that were asked on a 100-percent basis in 1980. ("Did 
you leave anyone out of Question 1 because you were not sure the 
person should be listed?", "Did you list anyone in Question 1 who 
is away from home now?", "Is anyone visiting here who is not 
already listed?") Although these questions appeared in the housing 
section of the short and long forms, they are not housing questions. 
They pertain to coverage of the population. We will reduce the 
number of coverage questions to one if we determine that this change 
has no adverse impact on our effectiveness in identifying potentially 
missed persons. 

With regard to the 100-percent housing questions, we do plan to move 
the question on complete plumbing facilities to the sample questionnaire. 
We will also replace the 1980 question on the number of units at a 
single address with a question on the number of units in a single 
structure. The latter was on the sample questionnaire in 1980. 

At this time, we do not have specific plans for changes in the remaining 
seven 100-percent housing questions. All were supported for inclusion 
on the short form by the Interagency Working Group on housing issues, 
which included representatives from Federal agencies that have an 
interest in housing programs. These questions also received strong 
support from local planners, including those from urban centers who 
are major users of the data for census blocks within their cities. 

Just as we must ask a minimum of population questions to assure 
complete coverage of the population, several of the 100-percent 
housing questions, namely the number of units in a structure, access 
(separate entrance to living quarters), tenure (owned or rented), and, 
to some extent, number of rooms, are essential to help us define housing 
units and assure complete coverage of the housing and the population 
universes. The 100-percent housing questions also serve to provide 
benchmark data that make it possible to use the information in the sample 
questions. There is also another reason for not moving some of the 
questions to the sample questionnaire: For some characteristics, 
such as whether a housing unit is part of a condominium or cooperative, 
the universe is too small to allow us to collect adequate data on a 
sample basis. 

With regard to the sample housing questions, we plan, at this time, to 
eliminate those dealing with stories in structure, elevator in 
structure, and cooking fuel. We are considering having one question 
instead of two on number of automobiles and trucks in the household. 

We are testing the possibility of collecting certain housing data 
common to all units in a multiunit building by means of a structure 
questionnaire. This questionnaire would be administered to a 
knowledgeable respondent such as the owner, manager, or superintendent 
of the building. This approach might enable us to collect more 
accurate data and reduce the number of questions asked of each of the 
households in multiunit structures. Among the questions that could 
be included on the structure questionnaire are units in structure, the 
year the structure was built, type of heating fuel, source of water, 
whether a housing unit is part of a condominium or a cooperative, 
and so forth. 



Questionnaire Design Research and Focus Groups 

The second topic I will discuss, Mr. Chairman, is questionnaire design 
research, including focus groups. It is important to design a census 
questionnaire that enhances both the response to the census and the 
correct completion of every item on the census questionnaire. That 
is why we have given much attention to this subject in the past and 
why we are currently doing so as we plan the 1990 census. 

While we ask focus groups to consider certain aspects of questionnaire 
design, they can be used to collect insights into a number of 
subjects. Focus groups typically consist of 8-12 paid participants 
recruited from the general public according to a set of specifications 
that allow us to target selected population subgroups. Focus group 
research seeks to develop qualitative insight and directions for 
future research on selected issues. Focus groups can uncover underlying 
motivations that can't be measured with a survey instrument. 
Participants respond to situations and inquiries posed by a trained 
moderator. Answers and reactions are probed at the option of the 
moderator so they might be documented for review by the sponsor. 
I will discuss later our experiences with focus groups in the 1985 
and 1986 test censuses. 

In our research, we look both at the questionnaire itself and at 
the entire questionnaire mailing package, including the outgoing and 
return envelopes and the instructions. 

I will begin by reviewing the research we conducted as part of the 
1980 census that relates to questionnaire design. I mentioned the 
Applied Behavior Analysis Survey earlier. We also conducted a Content 
Reinterview Study, in which we evaluated the quality of some of the data 
collected in the census. We evaluated the degree of response variability 
in cases where identical questions were asked in the reinterview and 
the census and the degree of response bias in cases where more detailed 
probing questions were asked in the reinterview than in the census. 
Information on data quality from the Content Reinterview Study coupled 
with information on item nonresponse gives us an indication of which 
census questions might pose problems for respondents. 

We also conducted a Knowledge, Attitudes, and Practices Survey to 
evaluate the 1980 public information campaign. We interviewed a 
sample of households to enable us to evaluate the campaign, particularly 
among minority populations. (The sample was designed to allow us 
to get results for the Black and Hispanic populations separately.) 
There were some important findings from this survey that relate to 
questionnaire design. The survey showed that the public information 
campaign appeared to stimulate cooperative mail response behavior, 
especially among Blacks, Hispanics, and low-income segments of the 
population. Still, 25 percent of the respondents reported no exposure 
to the census prior to receiving the census questionnaire. Some 
groups might benefit from motivational messages on the questionnaire 
or in the mailing package. I will discuss later studies we are 
conducting in 1986 that relate to the questionnaire mailing package. 

We also conducted the Alternative Questionnaires Experiment in 1980 
to test several variations in the design of the questionnaires. The 
standard census questionnaires were designed to be read by FOSDIC 
(Film Optical Sensing Device for Input to Computer), which I will 
describe in more detail later. We compared two alternative questionnaires 
(short and long-form versions of each) to the standard FOSDIC 
form. The first alternative questionnaire was similar to that used 
in the 1970 census: it was FOSDIC-readable and collected the household 
roster information in a linear rather than the columnar format we 
used in 1980. That is, person sections ran left to right across the 
page rather than from top to bottom. The second alternative questionnaire, 
designed by an outside contractor, was not FOSDIC-readable. It 
presented the questions in a different format, it used color differently, 
and it altered question wordings. One effect of these changes was 
that the non-FOSDIC questionnaires were longer than the standard 
census questionnaires. The non-FOSDIC short form was 8 pages compared 
to 4 for the standard short form, and the non-FOSDIC long form was 32 
pages compared to 20 for the standard long form. 

There was very little difference in the mail-return rates for the 
three questionnaire versions. The non-FOSDIC short form was returned 
at a slightly higher rate than the standard short form, even though 
the former had about twice as many pages. We are also comparing 
FOSDIC and non-FOSDIC questionnaires in the 1986 National Content 
Test to see how the public will respond to each. 

In our 1986 test censuses in Central Los Angeles County and East 
Central Mississippi, the questionnaire covers and envelopes were 
redesigned to be more colorful and attractive than those for the 
1980 Census. We are also testing the effects of including a motivational 
insert in some of the questionnaire mailing packages. As I 
mentioned earlier, research conducted after the 1980 census indicated 
that for some people, the arrival of the census mailing package was 
the first they had heard about the census. Thus, the census mailing 
package itself is a public information vehicle and is a critical 
information source for certain population subgroups. 

This test is designed to see whether a brief written appeal for 
cooperation accompanying mail-out of the census form can improve 
mail-response rates, lower question nonresponse, and increase cooperation 
with follow-up enumerators. We are also looking at whether this 
general-purpose insert has comparable effects for various population 
subgroups. 

The insert included red, white, and blue graphics and listed six 
reasons "to count yourself in on the census." The text was in both 
English and Spanish. We expect to have preliminary results from this 
study by late summer. 

In the 1986 National Content Test (NCT), we are also comparing 
mail-return rates, response rates for the individual questions, and data 
quality for two alternate envelope designs. The 1980 Applied Behavior 
Analysis Survey showed that about 13 percent of nonrespondents (2 
percent of all households) never opened their census mailing packages. 
Another 24 percent of nonrespondents (or 4 percent of all households) 
opened the envelopes but did not start to answer the questionnaire. 
Thus, the studies we are conducting as part of the NCT are aimed at 
finding new ways to get more people to open the envelopes and start 
answering the questionnaires. 
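
The two pairs of ABAS percentages above can be cross-checked with simple 
arithmetic. The sketch below, in Python, divides each group's share of all 
households by its share of nonrespondents to get the nonresponse rate that 
the pair implies. The function name is hypothetical, the inputs are the 
rounded figures quoted in this statement, and treating both shares in a 
pair as describing exactly the same group of households is an assumption. 

```python
# Cross-check of the rounded ABAS figures quoted in the statement.
# Assumption: in each pair, the two percentages describe the same group,
# once as a share of nonrespondents and once as a share of all households,
# so their ratio estimates the overall nonresponse rate.

def implied_nonresponse_rate(pct_of_all_households, pct_of_nonrespondents):
    """Return the nonresponse rate implied by one pair of percentages."""
    return pct_of_all_households / pct_of_nonrespondents

# 13 percent of nonrespondents = 2 percent of all households
never_opened = implied_nonresponse_rate(2, 13)

# 24 percent of nonrespondents = 4 percent of all households
opened_not_started = implied_nonresponse_rate(4, 24)

print(f"never opened the package:   {never_opened:.1%}")
print(f"opened but did not start:   {opened_not_started:.1%}")
```

Both pairs imply a nonresponse rate of roughly 15 to 17 percent, which is 
in line with the statement that over 80 percent of households returned 
questionnaires, allowing for rounding in the quoted figures. 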

One envelope was designed to be attractive and appealing. The assumption 
is that an attractive appearance motivates recipients to open their 
envelopes and respond to the enclosed questionnaire. The second 
envelope was designed to capitalize on the official nature of the 
census and attempts to use the authority and importance of the 
Federal Government to motivate recipients. The "attractive" envelope 
has red, white, and blue graphics; the "official" envelope is simply 
black print on white with no graphics. We will have results from 
this study during the summer. 

We are issuing a request for proposals for additional work on 
questionnaire design. We are asking the contractor to observe and 
report to us on respondent behavior in filling two kinds of long-form 
questionnaires: FOSDIC-readable and key-entry. The contractor will 
provide suggestions for improving both kinds of questionnaires and 
will design and develop alternative long-form questionnaires compatible 
with our data conversion techniques and other forms design considerations. 
We expect this project to be completed by late this year. 

Now I will discuss our focus group research in both the 1985 and 1986 
test censuses. 

We used group research to assess the public's reaction to optical 
mark recognition (OMR) questionnaires in the 1985 Test Census of 
Tampa, Florida. OMR scanning requirements place some constraints on 
the size and appearance of the questionnaire. For example, the 
scanner used in the Tampa census could only accommodate a questionnaire 
measuring 8 1/2 x 11 inches, which meant that all questions and 
answers had to be confined to this space. To learn how these design 
constraints affected respondents, we arranged for a contractor to 
conduct four focus groups comprised of middle-income Whites, 
low-income Whites, low-income Blacks, and low-income Hispanics who were 
residents of the Tampa test census area. The contractor conducted 
group sessions approximately two weeks after Census Day. Discussion 
focused on the OMR short form, which 80 percent of the households 
received. We did not use OMR technology to process the long-form 
questionnaire. 

The focus group participants reacted reasonably well to the overall 
design of the OMR questionnaire. Nevertheless, there were several 
comments directed toward design elements. Participants said that 
questionnaire instructions were complete and understandable, although 
it was apparent that people seldom bothered to read or consult them. 
Specific recommendations included repositioning Spanish language 
directions from the bottom back face to a more prominent location and 
researching alternative formats and contents. 

In summary, OMR format constraints led to some question-specific 
problems that needed solving; a few other problems with the questionnaire 
not related to the OMR format also surfaced. I will discuss 
OMR in more detail later. 

In the 1986 test census in Los Angeles, we decided to use focus groups 
to gather information to help determine why people responded or did 
not respond to the census. These focus groups were part of a larger 
Census Community Awareness Program. We recruited six focus groups, 
each consisting of 12-15 persons who were grouped based on whether 
they were respondents or nonrespondents, and whether they were Hispanic, 
Black, or Asian. 

The focus groups were conducted by a private contractor, and a final 
report is due from the contractor in May. At this time, only general 
observations from Census Bureau observers on the focus group sessions 
are available. 

In general, nonresponse appeared to be related to lack of foreknowledge 
of the census, the interference of personal life events (father having 
a stroke, a child destroying the questionnaire, etc.), or lack of 
motivation, including misunderstanding of the purpose or intent of the 
census. Few members of the focus groups said that questionnaire 
design or content was the cause of the failure to return it. 

Data Conversion Methodology 



We recently decided that our primary data conversion technology for 
the 1990 census will be FOSDIC. We believe that this decision 
represents the best balance of staffing, equipment, and workload 
considerations as they relate to the processing and collection 
offices. 

FOSDIC is an acronym for Film Optical Sensing Device for Input to 
Computer. A version of FOSDIC was first used in the 1960 census and 
has been our data conversion technology in each subsequent census. 
The complete data-conversion system consists of automated high-speed 
cameras that film the questionnaires, film developers to process the 
raw film into rolls of microfilm, and the FOSDIC machines that scan 
the microfilm and record data on computer tape. We call this the 
FACT system, which stands for FOSDIC and Automated Camera Technology. 

We also considered using keying as a primary data conversion technology, 
and earlier in our planning for 1990, we evaluated the use of a third 
technology, optical mark recognition (OMR). The FACT system is fast, 
accurate, and worked very well in the 1980 census. But there are 
technical limitations to how many FACT systems we can build and 
[illegible] for 1990. We had been considering, therefore, data keying 
and OMR technology to give us maximum flexibility in decentralization. 
I will discuss later our experiences with OMR and the reasons why we 
are no longer considering it for the 1990 census. 

Both FOSDIC and keying are tested methodologies that have proven 
workable over the years. Keying poses few constraints on questionnaire 
design. FOSDIC does pose design constraints (form size and layout, 
answer circles, etc.) and technical constraints (paper quality, etc.). 
However, we have used FOSDIC successfully in each census since 
self-enumeration was first introduced in 1960. In the 1960 census, one of 
the criteria we used to determine whether we could extend the 
mail-out/mail-back census method (and, thus, self-enumeration) was the 
degree to which answering FOSDIC-readable questionnaires posed problems 
for respondents. 

Mail-return rates, response rates to individual items, and data 
quality generally have been quite good in recent censuses. Nevertheless, 
we would like to make improvements and continue to investigate 
ways to do so. There is little evidence, however, that the questionnaire 
design constraints imposed by FOSDIC seriously affect any of 
these three areas. Still, we conducted the Alternative Questionnaires 
Experiment as part of the 1980 census and plan to conduct additional 
research in this area for the 1990 census. 

If we had decided to fully decentralize the processing, we would have 
had to use keying as a primary data conversion methodology in some 
offices. Keying would not be a viable option as the sole data 
conversion technology for the entire census because of the large 
numbers of keyers and key stations that would be required. And 
we determined that having two primary data conversion technologies 
would have excessively complicated our processing system for 1990. 
We will use keying only as a supplement to FOSDIC for entering 
some of the handwritten data on the questionnaire into computer- 
readable form. 

The issue of "how" we will convert questionnaire data into a computer- 
readable format is intimately tied to the issue of "where" the data 
conversion will take place. These two issues, "how" and "where," are 
the major issues confronting us as we plan to implement concurrent 
processing in the 1990 census. We reported on the "how" and "where" 
issues and on our plans to conduct concurrent processing at the 
hearing before this Subcommittee on May 1. The essence of concurrent 
processing is that we want to begin the conversion of questionnaire 
data to machine-readable form concurrently with the questionnaire 
collection phase. 

The "where" issue involves the number of processing offices and the 
degree of centralization or decentralization. In 1980, when we 
processed the census questionnaires sequentially, we had three 
processing centers. More processing centers are needed to make 
concurrent processing a feasible option, primarily because of the 
need to move materials quickly between processing and collection 
offices. Also, centralization of processing activities would require 
us to hire large numbers of employees in one employment area. We 
must weigh these concerns against problems related to decentralization, 
such as the need for more hardware in a greater number of locations 
and the difficulties of controlling and supporting many processing 
offices. 

At the May 1 hearing, we announced our decision to have 10-14 
processing centers in the 1990 census, where we would use FOSDIC 
to convert the data to machine-readable form. For district offices 
in certain high population density areas, the processing centers would 
receive the questionnaires, perform automated check-in using laser 
sorters, and perform an automated review of the questionnaires (edit), 
as well as data conversion. The district offices will be able to 
concentrate on field followup activities for households that did not 
mail back their questionnaires or that mailed back incomplete questionnaires. 
District offices in the rest of the country will receive the questionnaires, 
use pen-shaped wands to perform automated check-in, and conduct 
clerical reviews (edits). Once questionnaires pass the edit, they 
would be sent on a flow basis to a processing center for data conversion, 
using FOSDIC. Here they will also undergo a computer review for quality 
assurance. 

Optical Mark Recognition Technology 



As I mentioned earlier, the Census Bureau has decided to cease the 
testing and consideration of optical mark recognition (OMR) technology 
as a primary data conversion methodology for the 1990 decennial 
census. Now, I will discuss our experiences with OMR and the issues 
we considered in reaching the decision not to test it further. 

We undertook the investigation of OMR technology with the hope that a 
commercially available scanner could be found with the ability to 
meet the unique requirements of a decennial census data processing 
environment. We requested and received funds for a 1984 research 
and development effort, and initiated a competitive procurement 
process to acquire an OMR device for use in the 1985 test census. 
The General Services Administration issued the contract order to 
National Computer Systems (NCS) for its W201 OMR scanner. 

NCS is recognized as a leader in the area of OMR technology. We 
determined that among commercially available scanners, its W201 had 
the best potential for successful performance in the census environment. 
Use of this machine placed certain limitations on census operations 
(e.g., the size of paper we could use for the questionnaires, the 
type of marker the respondents had to use, etc.). Although the 
Census Bureau recognized these limitations to the use of the existing 
machine, we believed the 1985 test experience would provide 
important information about the effects of these limitations on 
census processing requirements. We were hopeful about the potential 
for dispersing the equipment to perhaps as many as 400 collection 
offices. 

The OMR test in our Tampa, Florida, test census was tailored 
to meet the operational guidelines of the W201 scanner 
as defined by NCS, even though these guidelines differed greatly from 
what the Census Bureau would require during an actual decennial 
operation. For example, the scanner could only accommodate a questionnaire 
measuring 8 1/2 x 11 inches. The Census Bureau designed the 
test questionnaire within this limitation. The scanner was engineered 
to read No. 2 lead pencils, and so the required pencil was enclosed 
in every OMR mailing package sent to Tampa households. A special 
data keying operation was set up to capture and convert information 
from the sample questionnaires sent to 20 percent of the households 
in the test census area since the scanner could read only single 
sheets and could not scan the multipage census booklets. These 
sample multipage booklets, although used to enumerate only one-fifth 
of the households during a census, represent as large a decennial 
processing workload as the short forms. 
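
The workload comparison above can be illustrated with a rough page count. 
The sketch below, in Python, assumes that the number of pages handled is a 
reasonable proxy for processing workload, and it borrows the 1980 standard 
form lengths cited earlier in this statement (4 pages for the short form, 
20 pages for the long form) together with the 20 percent sample rate. The 
variable names and the base of 100 households are illustrative only. 

```python
# Rough page-count comparison of short-form and long-form workload.
# Assumptions (not a Census Bureau method): pages handled stand in for
# processing workload, and the 1980 standard form lengths apply.

HOUSEHOLDS = 100            # illustrative base
LONG_FORM_HOUSEHOLDS = 20   # about 20 percent receive the long form
SHORT_FORM_PAGES = 4        # 1980 standard short form
LONG_FORM_PAGES = 20        # 1980 standard long form

short_form_households = HOUSEHOLDS - LONG_FORM_HOUSEHOLDS

short_form_page_total = short_form_households * SHORT_FORM_PAGES
long_form_page_total = LONG_FORM_HOUSEHOLDS * LONG_FORM_PAGES

print("short-form pages per 100 households:", short_form_page_total)
print("long-form pages per 100 households: ", long_form_page_total)
```

Under these assumptions the long forms account for 400 pages per 100 
households against 320 for the short forms, so the one-fifth sample 
is at least as large a workload as the short forms. 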

We carefully identified hardware, software, and operational problems 
with the scanner. The influence of these problems on the 1985 test 
and their implications for the 1990 Decennial Census were evaluated 
and are fully documented in a report issued in July 1985. 

Mr. Chairman, I will now summarize the major findings from this 
report. First, there were environmental concerns. We discovered 
during the testing and debugging of the OMR system that stringent 
environmental conditions had to be maintained to limit hardware problems 
and questionnaire feeding and reading problems. We had to subsequently 
install individually controlled humidifiers, dehumidifiers, heaters, 
and air conditioning within the specially modified space provided to 
house the scanner in the Census Bureau's permanent processing facility 
in Jeffersonville, Indiana. The scanner was sensitive to temperature 
and humidity; unless the questionnaires were acclimated to the controlled 
environment, they could not be scanned. These environmental concerns are 
particularly important, because they would create stringent requirements 
for the space for housing decentralized OMR systems in 1990. 

Second, there were questionnaire design and printing concerns. The 
questionnaire had to be designed within a double-sided 8 1/2 x 11 inch 
area. All questions, instructions, and response fields were forced 
into this limited space. Also, paper specifications proved to be 
restrictive and had to be altered before printing could be completed. 

Third, there were paper transport problems. To limit scanning 
problems encountered with the folded census forms, the vendor lowered 
the input hopper capacity from 500 to 150 forms; we later had to 
lower the capacity to 100 forms. This significantly slowed production. 

Fourth, the scanner was unable to determine when more than one 
circle within an answer field was filled in. Detection of multiple 
marks is important so that only complete data would be accepted as 
valid. 

Fifth, there were difficulties in evaluating scanner accuracy. 
The scanner was designed only for test scoring application where the 
documents to be scanned are filled out in a controlled environment. 
But in an uncontrolled census environment, answer marks on the census 
questionnaires are often light, not precisely filled within the 
circle wall, or made by an implement other than a pencil, and 
multiple entries or erasures are sometimes made within an answer 
field. Problems occurred in the test even though we provided 
respondents with the proper marking pencil, something that may not 
be possible in a decennial census. Although we did not evaluate 
the scanner's reading performance under these uncontrolled conditions, 
a decennial census is by its nature an uncontrolled environment 
and we remain concerned about these problems. 

In summary, Mr. Chairman, the results of all of these tests indicated 
that the W201 scanner could be a viable alternative for decennial 
census data capture only if substantial modifications to the census 
system were made. NCS acknowledged this fact and submitted an 
unsolicited proposal to develop and produce a prototype of a census- 
ready machine even before the first Tampa form was processed. 

In early planning for the 1986 Test Census of Central Los Angeles 
County, we considered further testing of OMR technology. In July 1985, 
it became clear that a revised OMR scanner could not be produced and 
procured in time for use in the 1986 test census. In September 
1985, we issued a Request for Information (RFI) to determine private 
sector capabilities to produce a satisfactory machine in time to be 
evaluated thoroughly in special purpose tests before the deadline for 
processing decisions in September 1986. After issuing the RFI, we 
decided that pursuing OMR technology as the primary data collection 
methodology for the 1990 census was not a viable option. This decision 
was based on four critical issues, which I will now summarize. 

First, there were technological concerns. The standards that must be 
met by any technology for converting decennial census data are, by 
necessity, high. If any new technology for data conversion is to be 
considered a viable alternative, it must be shown to perform favorably 
when compared to the system for conversion used during the 1980 census. 





The OMR scanner we tested failed to meet the proven speed and accuracy 
of the system used in the 1980 census, FOSDIC. 

Second, there was the issue of decentralization. One of the primary 
reasons we considered OMR for data conversion was that, like key-entry, 
it might give us more flexibility in the degree of decentralization 
of the census processing system. Any highly decentralized processing 
system for a decennial census, however, must have very low 
maintenance requirements. Based on the 1985 test experience, we were 
concerned about managing a widely distributed OMR system. 

Third, there was the issue of timing. We face a very tight schedule 
for census decisionmaking about automation of data collection and 
processing operations because all required equipment for 1990 must be 
fully tested and in place for the 1988 dress rehearsal census. We 
had set a goal of deciding what system(s) will be used for primary 
data capture during the 1990 census by September 1986. Some outside 
observers believed even this [illegible] too little time for 
procurement and implementation. 

We did not believe that a reliable OMR scanner that corrected the 
problems discovered in the 1985 test census could be tested fully 
before the automation decision date of September 1986. A best case 
scenario would be to procure and test an OMR device by September 1986 
and contract for production beginning in September 1987. The required 
modifications to the scanner, although not impossible, represent 
significant engineering and technological changes that must be 
identified, designed, built, thoroughly tested and debugged, with final 
alterations made and retested before the scanner could be placed into 
production in preparation for use in 1990. Even if a prototype of a 
full-featured census-ready scanner could have been built, the 
manufacturing of sufficient numbers of machines to distribute OMR to many 
locations could not be guaranteed. 

Finally, there was the issue of cost. There were substantial development 
costs in design and fabrication of an OMR scanning system that would 
meet the Census Bureau's basic technical and decentralization needs. 

We also questioned whether an OMR system developed to meet unique 
decennial data conversion requirements would have a high remarketing 
potential, or be of use to other Census Bureau programs after the 
1990 Decennial Census processing is completed. 

Our decision was based on available evidence from the 1985 OMR test 
experience, the current state of the technology, the potential for 
successful decentralization, the census ADP equipment acquisition 
schedule, and the significant costs involved. However, we will 
consider OMR as part of a research and development program for the 
2000 census, along with high-resolution image capture and other 
technologies. By so doing, we will further our efforts to accomplish 
more efficiently and accurately what is a very difficult and complex 
data processing operation. 



Mr. Chairman, that concludes my testimony on questionnaire content, 
design, and processing. At this time, we expect the 1990 census 
short- and long-form questionnaires to be about the same length as 
those for the 1980 census. We plan to make some changes in the 
specific questions or question wording and may make additional changes 
based on the results of the National Content Test. 

We plan to process the questionnaires using FOSDIC technology, as in 
1980. FOSDIC imposes some design constraints on the questionnaires, 
but we are undertaking research that could lead to improvements in 
the design of a FOSDIC-readable questionnaire and the other components 
of the mailing package. Although we determined not to use OMR 
technology in the 1990 decennial census, we will consider OMR as 
part of the research program for the 2000 census. 

In determining which questions to include on the census questionnaire 
we are very careful to include only those that meet important public 
needs. Our experience has been that when we make the public aware of 
the importance of the census and why we are asking each question, we 
achieve good public cooperation. 

I look forward to hearing any comments or questions you might have. 



CONCLUSION 







Responses to Questions from 
Subcommittee on Census and Population 
to 

Susan Miskura 
Chief 

Decennial Planning Division 



QUESTION 1. We have just heard the GAO testify that you are not going to 
use the most cost-effective procedures to process the 1990 census. How 
do you respond to these charges? 

ANSWER: Many of the new things we are doing for 1990 will increase 
efficiency. For example, automating the geographic support system (TIGER); 
having an automated address control file and doing automated check-in; 
and having automated management information systems will be cost-effective 
and will directly respond to problems experienced in the 1980 census. 

We will only invest in automation that reduces costs or that is necessary 
for maintaining or improving the quality of the census. While we cannot 
know at this time whether a specific automation decision will save money, 
we believe that our decisions will lead to a more efficient and accurate 
census. Automating census operations will allow us to replace 
labor-intensive and error-prone clerical operations with automated techniques 
that are quicker, more accurate, and more controllable. 

Cost-effectiveness must be examined in terms of the entire census process. 
What initially may appear to be cost-effective for one particular aspect 
of the census may not be when all other related aspects of the census are 
considered. Furthermore, we must also consider the risk that new 
technologies might fail and the costs associated with such failure. 



QUESTION 2. On May 1st you announced that concurrent processing of the 
census forms would only occur in those areas that are hard to enumerate. 
According to the GAO testimony that we have heard today, this decision 
means that you will use a "modified 1980 system" for most of the Nation. 
They say they are concerned that you have "foregone the benefits that 
could be derived from a more automated operation." Again, quoting them, 
they feel that your decision "compromised its goals for automation and 
for the census as a whole." 

Now these are some pretty weighty criticisms of your plans. How 
do you respond to them? What steps are you taking to make sure that 
the goals for the 1990 census are not compromised? 



on 

the 1990 Census Questionnaires and Automation 
May 15, 1986 







ANSWER: Our processing decision, which we discussed in hearings on May 1 
and May 15, allows us to perform concurrent processing for the entire 
lower 48 states, not just for certain areas. (We have not determined our 
processing plans for Alaska, Hawaii, Puerto Rico, and the Outlying Areas.) 

For all areas, we will be entering completed and accepted data into the 
computer on a flow basis, concurrent with field operations, and much 
earlier than for the 1980 census. For all areas, we will have an automated 
address control file and automated check-in and an automated management 
information system. This plan will enable us to process data far sooner 
than we did in 1980 and therefore release data products earlier. 

The major difference between the district offices in the high population 
density areas and elsewhere is that we will do the initial edits by 
computer in the former and clerically in the latter. Even where we do 
the initial edits clerically, we will still do computer edits later as a 
quality assurance measure. To do initial edits by computer everywhere 
would have required a far greater number of processing centers (with much 
more equipment and staff) than the 10-14 we now plan. 

We are planning several major automation advances over the 1980 census: 
(1) earlier data capture, (2) automated address control file and automated 
questionnaire check-in, (3) automated questionnaire edit, (4) automated 
management information system, and (5) the automated geographic support 
system (TIGER). 



QUESTION 3. GAO stated in its testimony that the effect of questionnaire 
size on response rates is particularly evident in inner city 
areas. According to GAO, for the 1980 census, the mail-return rate for 
the short form was over 7 percent better than for the long form in these 
hard-to-enumerate areas. In addition, according to GAO, rates for the 
short form questionnaires have been consistently higher than for the long 
forms in the 1985 and 1986 pretests. However, you've mentioned in your 
testimony that the 1980 mail-return rates for short and long forms were 
not significantly different. How do you respond to that? 

ANSWER: In the 1980 census, for the entire Nation, the mail-return rate 
for the short form was only about 1.5 percent higher than for the long 
form. For our centralized district office areas, which included mostly 
hard-to-enumerate central cities, the mail-return rate for the short form 
was only about 2.5 percent higher than for the long form. 

In the 1985 and 1986 test censuses, there were larger differences between 
short- and long-form mail-return rates, about 8 to 10 percent. However, 
it is difficult to draw any conclusions from these figures because we 
were testing new procedures in these sites, these are just a few 
localities, and there was limited outreach available. The numbers from the 
1980 census are much more useful because it was the national census, 
complete with full-scale publicity, and a much larger volume of forms was 
returned. 







QUESTION 4a. You've mentioned that one way to add any of the new questions 
tested at the National Content Test is to find ways to employ innovative 
sampling techniques that would allow you to ask more questions without 
increasing the reporting responsibilities for any one household. How can 
that be done? Would you share with us some of the sampling techniques 
that you may be considering? 

ANSWER: Now that we have decided our basic processing plan for the 
1990 census, we are examining the cost and feasibility of using a kind of 
nested sampling technique. If we should adopt this technique, about 
19 percent of the housing units would receive questionnaires that include 
the basic sample questions, those for which the data are needed for census 
tracts, places, and so forth. (This questionnaire would be very similar 
to the 1980 sample questionnaire.) An additional 1 percent of the housing 
units would receive a different sample questionnaire that includes different 
questions, those for which the data are needed only for very large geographic 
areas such as states, the largest metropolitan areas, and certain 
large counties. This would create two samples: a 19-percent sample and a 
1-percent sample. The total sample size, 20 percent, is roughly equivalent 
to the sample size in 1980. Alternatively, the split could be 18 percent 
and 2 percent. 
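
The arithmetic of the two-part sample described in this answer can be sketched as follows. This is an illustration only, not a Bureau procedure: the function name `nested_sample_split` and the housing-unit total in the usage note are hypothetical, and only the 19/1 and 18/2 percentage splits come from the answer above.

```python
def nested_sample_split(housing_units, basic_pct, supplemental_pct):
    """Split a count of housing units into the two samples described above.

    basic_pct        -- percent receiving the basic sample questionnaire
    supplemental_pct -- percent receiving the different sample questionnaire
    Both percentages are whole numbers; results use integer division.
    """
    total_pct = basic_pct + supplemental_pct
    if basic_pct < 0 or supplemental_pct < 0 or total_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return {
        "basic": housing_units * basic_pct // 100,
        "supplemental": housing_units * supplemental_pct // 100,
        "total_pct": total_pct,
    }
```

For a hypothetical 1,000,000 housing units, `nested_sample_split(1_000_000, 19, 1)` gives 190,000 basic and 10,000 supplemental questionnaires, and the 18/2 split gives 180,000 and 20,000; in both cases the combined sample stays at 20 percent.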



QUESTION 4b. In my understanding, the purpose of the National Content Test 
is to improve upon the questionnaire content of the census. Not to cast a 
dark cloud over your plans, but what will happen if the Census Bureau finds 
itself being unable to add questions to the 1990 census either because existing 
questions cannot be dropped or because no innovative sampling techniques are 
found to be suitable? 

ANSWER: In that case, we would not ask the additional questions. We do not 
plan to increase the overall workload for respondents for the 1990 census. 
In our consultations with data users, we have heard many more legitimate 
requests for data than we can reasonably satisfy. Even by removing some 
existing questions and using different sampling techniques, we would not 
be able to satisfy all requests for new data. 



QUESTION 4c. What other benefits will the National Content Test give the 
Census Bureau? 

ANSWER: In addition to testing new questions in the National Content Test, 
we are also testing new wordings, formats, or placements (where the question 
is placed on the form in relation to other questions) for 1980 census 
questions that could improve the quality of the data. We are also testing the 
effects of different envelope designs on mail-response rates. 







QUESTION 5. Please comment on GAO's finding that degree of literacy skills 
has a direct impact on responses to questionnaire forms. 

ANSWER: We require only one knowledgeable respondent to provide data for 
a complete household. We realize that some persons, and perhaps even some 
complete households, do not possess the language skills to properly fill out 
a census questionnaire or other types of documents. We provide telephone 
and walk-in assistance in filling out questionnaires and provide questionnaires 
and instruction guides in languages other than English. If a householder 
cannot complete a questionnaire using these services, we complete the 
interview by personal visit. We send census enumerators to follow up on housing 
units that do not return questionnaires and telephone or personally visit 
housing units for which additional data are needed. 

We keep the size of the questionnaire reasonable by asking only required 
data, and we do extensive testing on questionnaire wording, design, and 
so on. 



QUESTION 6. At our May 1st hearing, the Census Bureau announced its decision 
to do concurrent processing in only hard-to-enumerate areas. Because 
undercount is a serious problem in these hard-to-enumerate areas, the Census Bureau 
developed local review programs in which local officials review the accuracy 
of numbers of areas covered by the census. It seems that concurrent 
processing may hinder the effectiveness of the local review programs. What is 
your response to that? 

ANSWER: We are committed to a successful Local Review Program for the 
1990 census. Concurrent processing will not hinder the effectiveness of 
the program. It should improve the program by making the data provided 
to the local officials more accurate and complete. By converting the 
questionnaire data to machine-readable format earlier in the census process, 
we will have more time for review and correction of the data and for early 
identification and correction of coverage and other enumeration problems. 



QUESTION 7. What are the reasons for not doing concurrent processing for 
the entire nation? 

ANSWER: We will be doing concurrent processing for the lower 48 states 
and the District of Columbia. We have not determined yet our processing 
approach for Alaska, Hawaii, Puerto Rico, and the Outlying Areas, but we 
are committed to meeting the same goals for these areas (timeliness, 
accuracy, and efficiency) as for the rest of the country. 

QUESTION 8. You've mentioned in your testimony the issue of cost related 
to adopting the OMR technology to the decennial processing. How much do 
you estimate the cost would be? What are the cost factors involved? 

ANSWER: We have estimated that the research and development programs 
required to procure a prototype optical mark recognition (OMR) scanner to 
test in 1986 would cost at least $1,000,000. We also estimate that the 
unit cost for a production scanner for the 1990 Decennial Census would be 
about $150,000 each. Based on these estimates, we believe OMR technology 
would be more expensive than FOSDIC (Film Optical Sensing Device for Input 
to Computer). 
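
The two estimates in this answer combine into a simple lower bound on hardware-related OMR cost. The sketch below is illustrative only: the function name `omr_cost_floor` is hypothetical, and the answer does not say how many production scanners would be needed, so the scanner count is left as an input. No FOSDIC figure is given in the answer, so no comparison is attempted.

```python
PROTOTYPE_RND_COST = 1_000_000  # "at least $1,000,000" for prototype research and development
UNIT_COST = 150_000             # "about $150,000 each" for a production scanner


def omr_cost_floor(num_scanners):
    """Lower-bound dollar cost: prototype R&D plus production scanners.

    num_scanners is an assumption supplied by the caller; the testimony
    gives no count of scanners required.
    """
    if num_scanners < 0:
        raise ValueError("num_scanners must be non-negative")
    return PROTOTYPE_RND_COST + UNIT_COST * num_scanners
```

With a purely hypothetical 100 scanners, `omr_cost_floor(100)` comes to $16,000,000; operating, staffing, and maintenance costs are outside this sketch.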






Mr. Garcia. I thank you, Ms. Miskura. 

Now I would like to invite Mr. Dodaro, who is the Associate 
Director, General Government Division, U.S. General Accounting 
Office, to come on up. 

As I said earlier, members of the subcommittee and I met with 
Mr. Dodaro for the purpose of going over what we would be talking 
about here today. 

Mr. Dodaro, it is good to see you and I guess you know the proce- 
dure. What we would like to do is to have you submit your testimo- 
ny for the record and that will be accepted without objection. If 
you would be kind enough to proceed. 

STATEMENT OF GENE DODARO, ASSOCIATE DIRECTOR, GENERAL 
GOVERNMENT DIVISION, U.S. GENERAL ACCOUNTING OFFICE 
Mr. Dodaro. Thank you, Mr. Chairman. Good morning. 
On my immediate left is Jerry Donoghue, who is responsible for 
our work on the questionnaire, and on my far left is Johnnie Butts, 
who is responsible for our work on census automation. 
[Mr. Dodaro presents slide presentation.] 

Mr. Dodaro. I would like to begin with a brief overview of what 
we reviewed during the course of our work. 

The first area we looked at was the census questionnaire itself. 
The design and content of the questionnaire drives many of the 
other decisions and activities that occur during the conduct of the 
census and greatly influences the outcome. The response rates, 
respondent burden, quality of the information, as well as the data 
processing options all emanate in part from the decision on the 
questionnaire. 

The second area we looked at was the Bureau's planning activi- 
ties for automating the 1990 census. Both the automation area, as 
well as the questionnaire area, have a major bearing on controlling 
the costs and improving the quality of the information during the 
upcoming census. 

About 70 percent of the $1.1 billion spent to take the 1980 census 
went for data collection, preparation, and processing. Additionally, 
during 1980 many labor-intensive activities were part of the census 
operation. 

In our opinion, and in the view of the Bureau, some of these 
areas would be fertile grounds for automating. 

For example: 37,000 clerks were used in 1980 to manually edit 
the questionnaires. Computerized editing of the questionnaires 
could not only reduce the number of people required, but also 
would achieve greater consistency in the edit, thus improving the 
quality of the data. Additionally, earlier capture of the information 
from the questionnaires would allow greater time for local review. 

There are three major areas that we would like to stress this 
morning. 

The first is that we think the Bureau is missing an opportunity 
to test a more user-friendly, easier-to-complete short-form 
questionnaire. We think that this could increase mail response rates, 
particularly in hard-to-enumerate areas. 

Second, we think the Bureau has not moved as aggressively as it 
needs to in embracing new technologies for capturing the 
information from the questionnaire. The Bureau started too late in its 
planning activities and has been moving too slowly, and now finds 
itself in a position of falling back upon data capture technologies 
that it has used in the past, because too little time is available for 
exploring additional options. 

Third, we are skeptical that the Bureau will be able to achieve 
its goal of holding the cost, on a per-household basis, exclusive of 
inflation, in 1990 similar to that that was experienced in 1980. All 
indications are that the 1990 census costs will increase. 

I would like to talk about each of these areas in a bit more 
detail. 

First, the short-form questionnaire. We focused most of our effort 
on the short-form questionnaire, which is sent to 80 percent of the 
households. We looked at the justifications that were provided to 
the Bureau for the need for that information, particularly the 
housing information. We also talked to Federal agencies as to how 
they would use the information that is collected. 

As you know, the Census short-form questionnaire contains 
seven population questions, three questions designed to improve 
the count, and eight questions on housing. We think most of the 
housing questions add to the length and complexity of the question- 
naire, and have a tendency to decrease the response. 

We think questions such as the value of property, amount of rent 
that is paid, as well as questions on the number of rooms, have a 
tendency to also detract from the Bureau's ability to obtain a quick 
and accurate population count earlier in the process. 

A number of the focus groups, which the Bureau discussed this 
morning, also echoed similar concerns. They raised questions about 
the complexity of the form, the need and the legitimacy of the Gov- 
ernment to ask many of the housing questions when the primary 
purpose of the census was to obtain the population count. 

Also, during public hearings, several persons raised concern 
about the length and complexity of the questionnaires. 

In our discussions with Federal agencies we found that many 
used sample data from the questionnaires even though 100 percent 
information was available. Also we found that Federal agencies 
asked for data at 100 percent for geographic levels that are already 
estimated from the long-form questionnaire. 

We also found that some write-in information that was asked for 
as part of the census was never captured and used as part of the 
processing activities for 1980. 

While we have not proved conclusively that the housing ques- 
tions on the short form are not needed, all indications point to the 
feasibility of removing those questions and testing a more stream- 
lined short-form questionnaire. This ought to be explored; particu- 
larly in view of the respondent burden associated with those ques- 
tions — of asking them of 100 percent of the Nation's households — 
as well as the costs that are likely to increase as part of the 1990 
operations. 

What do we think the test of a short form could result in? No 
one knows until the test is conducted whether or not the stream- 
lined form will increase the mail response rate. That is one of the 
reasons we advocate the test. 






However, traditionally, the short-form questionnaire has received 
a higher response rate than the long-form questionnaire. While 
there was only a 2-percent differential nationwide, that differential 
was higher in hard-to-enumerate areas. 

Additionally, during the 1985-86 pretests the short-form ques- 
tionnaire has been receiving anywhere from 6 to 10 percent higher 
response rates than the long-form questionnaire. 

The response rate increase is very important, since the Bureau 
has estimated that for every 1 percent increase in the response 
rate, it can save $6 million in associated costs. This would include 
reducing followup costs; moreover more completely filled out ques- 
tionnaires would reduce some of the costs associated with sending 
enumerators to follow up, which is particularly occurring in the 
hard-to-enumerate areas. Streamlining the questionnaires would 
also have a tendency to eliminate some enumerator bias and we 
feel reduce the respondent burden. 
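
The savings figure cited above lends itself to a rough calculation. The sketch below is illustrative only: the function name `estimated_followup_savings` is hypothetical, and it assumes the $6 million-per-point estimate scales linearly, which the testimony does not state.

```python
SAVINGS_PER_POINT = 6_000_000  # dollars saved per 1 percent rise in mail response, as cited above


def estimated_followup_savings(response_rate_gain_points):
    """Estimate dollar savings for a gain in mail response (percentage points).

    Linear scaling is an assumption made for illustration; the testimony
    gives only the per-point figure.
    """
    if response_rate_gain_points < 0:
        raise ValueError("gain must be non-negative")
    return SAVINGS_PER_POINT * response_rate_gain_points
```

Under that assumption, a hypothetical gain of 2 percentage points would correspond to about $12 million. Whether a streamlined form would produce any such gain is exactly what the proposed test would have to establish.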

Removing the housing questions from the short-form question- 
naire would result in about 85 million households in 1990 being 
asked eight fewer questions, and could reduce the time it takes to 
complete the short form by one-third. 

Now in addition to the response and cost and burden, there is 
also another avenue that could be opened up to the Bureau if it 
decides to go with the more streamlined short form. That is, in in- 
creasing its options for data capture. This is particularly important 
as it relates to the use of the OMR equipment, as I will talk about 
in a minute. 

I would like to shift now from discussing the questionnaire to the 
Bureau's planning activities to automate the 1990 census. 

We feel that the Bureau could have moved sooner to begin plan- 
ning. The lack of detailed planning and advanced decisionmaking 
in this area has limited the Bureau's options for exploring new 
data technology for 1990. 

In 1982 we urged the Bureau to begin seeking out new ways to 
automate the 1990 census. But we noted at that time that its origi- 
nal planning activities were not very well coordinated. The Bureau 
agreed and affirmed its commitment to seek out options for auto- 
mating the census. 

But as late as December 1984, the Secretary of Commerce, in his 
report, under the Federal Managers Financial Integrity Act, noted 
that the lack of a master plan for conducting the 1990 census was a 
material weakness in the Department. 

The Bureau published a plan in February 1985 which set out a 
series of milestone decisions, but in our view the plan was not as 
integrated as it needed to be. 

A good illustration of what has occurred as a result of not 
making earlier decisions and preparing detailed plans early on for 
the 1990 census, is the Bureau's efforts to use the optical mark 
reader technology. 

Early on, the Bureau began exploring the use of OMR as its pri- 
mary data capture technology for 1990. However, it limited its con- 
sideration of the optical mark reader to commercially available, 
off-the-shelf equipment. This equipment provided several potential 
advantages and disadvantages. 






On the advantages side, it offered a one-step processing activity, 
as compared to the Bureau's three-step filming, developing, and 
scanning process used under the FOSDIC system. It also offered 
the advantage of being used in a more decentralized operation and 
easier to operate than some of the FOSDIC operations which 
require some knowledge in such subjects as chemistry. 

On the disadvantages side, however, there were a number of lim- 
itations of the commercially available equipment to meet the Bu- 
reau's requirements. 

For example, the optical mark reader would only read an 8 1/2 by 
11 inch form, where the Bureau's short form alone was 
approximately 11 by 28 inches. 

Second, the optical mark reader could not read the Bureau's long 
form multipage booklet unless the pages were separated. 

Third, the optical mark reader was meant to be used primarily 
in controlled environmental situations where people use specialized 
writing instruments and humidity and other environmental factors 
could be controlled. 

Despite the knowledge that this piece of equipment could not 
meet the Bureau's requirements, the Bureau went ahead and 
tested it during the 1985 pretest. While the pretest showed the 
magnitude of the problems, it proved very little other than that the 
equipment could not meet the Bureau's requirements. 

The vendor had proposed to the Bureau in January 1985 and 
again in April 1985 that it could develop a modified optical mark 
reader to more appropriately satisfy the Bureau's requirements. 

The Bureau moved slowly however, on this request, and in No- 
vember 1985 notified the vendor that the OMR would no longer be 
considered for the primary data capture in the 1990 census. As a 
result, the optical mark reader was not tested during the 1986 pre- 
test, although a stated objective of those pretests was to consider 
alternative data capture technologies. 

In summary, from the beginning the Bureau knew that the com- 
mercially available equipment would not meet its known require- 
ments and expressed a real reluctance to change those require- 
ments. 

In our view, moving ahead and testing it, absent the possibility 
of changing the requirements, was not a prudent use of time. A de- 
cision could have also been made earlier in the decade to finalize 
those requirements and begin a research and development effort to 
come up with a modified optical mark reader that could have satis- 
fied the Bureau's requirements. 

Where do we stand today? As the Bureau mentioned this morn- 
ing, it has made its decision to use FOSDIC as the primary data 
capture system for 1990. We are encouraged that the Bureau made 
the decision in April, which is 5 months earlier than it originally 
planned. And we are also pleased to see that the Bureau has not 
opted to use data keying as a primary data capture. 

However, there are a number of concerns that emanate from this 
decision. Essentially, the Bureau is back to using the data capture 
technology that it has used since the 1960's. They have proposed 
some modifications. However, the modifications have not yet been 
fully explored, and we are likely to have many manual operations 
that occurred in 1980 repeated in 1990. For example: About 93 million 
questionnaires during the 1990 census could be manually 
edited again. Also, I know the Bureau is attempting to sort through 
some of the logistical and management problems stemming from 
using 10 to 14 processing offices as opposed to the 3 used in 1980. 

The questionnaire, as the Bureau indicated this morning, basical- 
ly will remain unchanged from the 1980 questionnaire. 

There are a few other concerns that we have, that we would like 
to talk about this morning. 

The first is that we think the cost will continue to increase for 
the 1990 Census. The many manual operations likely to be repeated, 
combined with an estimated workload increase of an additional 
18 million households that the Bureau would have to canvass in 
1990, all are likely to escalate the cost. 

Census Bureau temporary labor costs have been reported to be 
increasing, in excess of the general rate of inflation and the 
Bureau is likely to use in the neighborhood of several hundred 
thousand people again as it did in 1980 to take the 1990 Census. 

Additionally, as I mentioned, there are a number of potentially 
greater management problems that the Bureau will have to 
confront in using more processing offices. 

What can be done at this point? The Bureau has been reluctant 
to test the shorter, more streamlined short-form questionnaire 
without the housing questions. We think such a test may be justified, 
although it would have to be carefully designed and explored; and 
the results of that test weighed against meeting the needs of the 
data users. 

There also needs to be more attention given to coming up with 
costs associated with the Bureau's recent decision on processing 
and district office configurations. We believe cost estimates should 
be shared with the Congress and a dialog continued as to what are 
some options to reduce those costs in 1990, at least control them 
within a manageable level. 

We also strongly urge the Bureau, as it has indicated, to contin- 
ue considering alternatives including automating some of the 
manual activities for the 1990 Census. 

As I mentioned this morning, time is running out with regard to 
making any sweeping changes for the 1990 Census in the area of 
automation, but we would urge more attention being given to com- 
puterized editing of the information. 

That concludes our summary statement, Mr. Chairman. We 
would be glad to answer any questions at this point. 

[Statement of Mr. Dodaro and his response to written questions 
follow:] 






STATEMENT OF 



GENE L. DODARO 



ASSOCIATE DIRECTOR, GENERAL GOVERNMENT DIVISION 
UNITED STATES GENERAL ACCOUNTING OFFICE 
Mr. Chairman and Members of the Subcommittee: 

I am pleased to appear today to discuss preparations for the 
1990 Decennial Census in two interrelated areas: questionnaire 
development and data capture technology. Improvements in these 
areas could greatly contribute to controlling costs and enhancing 
the quality of the census. With me is Jerry Donoghue, who is 
responsible for our work on the questionnaire, and Johnnie Butts, 
who is responsible for our census automation work. 

The Bureau's planning and preparations, including its tests 
and decisions to date, have led us to believe that the 1990 
census will not be as cost efficient as it could be. The 
Bureau's reluctance to test a shorter short form, its 
questionable approach to procuring optical mark reader (OMR) 
equipment, and the questions raised by its recent decision on 
data capture and processing office configuration all point to 
missed opportunities to significantly improve upon the 1980 
census. 

ADVANTAGES OF A SHORTER 
SHORT FORM 

As we mentioned in our previous testimonies before your 
subcommittee in June 1984 and April and July 1985, we have 
reservations about the size and content of the short form sent 
to about 81 percent of the households in 1980. We continue to 
believe that the short form should be limited to the basic 
questions needed to obtain an accurate population count — that 
is, questions oriented towards population characteristics and 
those used to improve the count. Housing questions — such as 
plumbing, value, and rent of housing units — increase the 
questionnaire's complexity and consequently discourage response. 
We believe that housing data obtained from the long form (sample 
questionnaire) may meet federal needs. This sample form 
contained not only the short form questions but more detailed 
questions on population and housing as well. One housing unit 
in six was asked to complete the long form in 1980 except for 
communities under 2,500 people, where one half of the housing 
units were sampled. 

Our more recent work has raised questions regarding the 
federal need for housing data from 100 percent of the nation's 
households. For example, we found that some federal data 
users were actually using sample data even though 100 percent 
data or data collected from all of the households was available, 
and that some users had requested 100 percent housing data for 
geographical levels for which data are also estimated from sample 
questionnaires. The need for housing data from all households 
should be more closely weighed against associated collection 
costs and respondent burden. 

The content and design of the questionnaire is a major 
factor affecting response rates, quality of response, response 
burden, and data processing requirements. Since about 70 percent 
of the $1.1 billion in 1980 Decennial Census costs were incurred 
with the collection, preparation, and processing of the data, 
efforts to reduce the short form questionnaire, both in size and 
questions, could be cost beneficial. This is particularly 
important considering that the 1990 Decennial Census will record 
information from about 106 million households, 18 million more 
than for 1980. If the Bureau does not streamline and simplify 
the questionnaire form, it will be missing out on a potential 
cost savings opportunity, as well as a chance to improve the 
perceptions and receptiveness of the U.S. public to the census. 

Cost could be reduced 

The mail response rates in 1990 will have a direct impact 
on the nonresponse followup costs. Streamlining and simplifying 
the short form should improve the mail response rates for 1990, 
and might greatly reduce the high cost of sending out enumerators 
to follow up with nonrespondents, particularly if better response 
rates are obtained from the hard-to-enumerate areas. For 1990, 
the Bureau estimates that each 1-percent increase in the mail 
response rate would save about $6 million in followup costs. 
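
As a rough illustration of the Bureau's estimate, the savings can be treated as scaling with the gain in mail response. The sketch below assumes the $6 million figure applies equally to every additional percentage point; that linearity, the function name, and the sample gains are assumptions made for illustration and do not come from the testimony.

```python
# Bureau estimate cited in the testimony: about $6 million in followup
# costs saved for each 1-percent increase in the mail response rate.
SAVINGS_PER_POINT_MILLIONS = 6


def followup_savings_millions(points_gained):
    """Estimated followup savings in millions of dollars.

    Assumes the per-point estimate applies linearly, which is an
    assumption made here rather than a statement from the testimony.
    """
    return SAVINGS_PER_POINT_MILLIONS * points_gained


# Hypothetical gains in the mail response rate, not test results.
for gain in (1, 2, 5):
    print(f"{gain}-point gain: about ${followup_savings_millions(gain)} million")
```

Under these assumptions, even a gain of a few percentage points would amount to tens of millions of dollars in followup costs.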

That the size of the questionnaire form influences the 
response rates is particularly evident in inner city areas. For 
the 1980 census, the mail return rate for the short form was 
over 7 percent better than for the long form in these 
hard-to-enumerate areas. Also, rates for the short form 
questionnaires have been consistently higher than for the long 
forms in the 1985 and 1986 Pretests, as shown below. 



                              Response rate (percent)
Pretest                       Short form    Long form

Jersey City                       39            31
Tampa                             58            48
Los Angeles                       36            28
East Central Mississippi          65            59
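
The gap between the two forms at each pretest site can be worked out directly from the figures above. The following is a minimal sketch; the dictionary layout and the function name are illustrative choices, and only the eight rates come from the testimony.

```python
# Mail response rates (percent) from the 1985 and 1986 pretests as
# reported in the testimony, stored as (short form, long form).
PRETEST_RATES = {
    "Jersey City": (39, 31),
    "Tampa": (58, 48),
    "Los Angeles": (36, 28),
    "East Central Mississippi": (65, 59),
}


def short_form_advantage(rates):
    """Return the short-form lead over the long form, in percentage points."""
    return {site: short - long for site, (short, long) in rates.items()}


gaps = short_form_advantage(PRETEST_RATES)
for site, gap in gaps.items():
    print(f"{site}: short form ahead by {gap} points")
```

The differences range from 6 to 10 percentage points across the four sites.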



We do not know the extent to which a shorter short form 
will increase response rates. However, without a test no one 
will know. Considering that a short form will be sent to about 
85 million households in 1990, and the pressures to hold down 
government spending, evaluation of a revised form seems 
worthwhile. 



Some enumerator visits also could be eliminated with a 
shorter form because fewer respondents may return incomplete 
questionnaires which require enumerators to collect the missing 
data. Of the 64 million questionnaires that were returned by 
mail in the 1980 census, 13 percent of the short forms did 
not meet the Bureau's standards for completeness. On the other 
hand, 36 percent of the long forms did not meet these criteria. 






A shorter, simplified short form questionnaire also could be 
processed at less cost. However, the Bureau must act on the 
size of the form to potentially benefit from data processing 
savings. Also, a reduced short form may provide the Bureau with 
cost-saving options for its data automation decisions. 

Respondent burden reduced 
with fewer questions 



Reducing the size of the 1980 short form will benefit 
respondents because the burden of completing the form will be 
reduced in terms of number of questions, time, and in the 
perception of difficulty. 

With a short form oriented toward population and coverage 
only, about 85 million households would answer fewer questions. 
This could result in up to a one-third reduction in the time to 
complete the questionnaire. 

The perception of the questionnaire is an important factor. 
It is a burden for the respondent if the individual perceives the 
form to be a difficult task, and this could affect the completion 
of the form. The Bureau has indications that perception of 
difficulty is a factor affecting form completion. 

For example, a Bureau post-1980 Census study found that 
nonrespondents attributed not starting to fill out the form to 
the perceived difficulty of the task. The study showed that "the 
easier to fill the form was perceived, the more likely it was to 
be started." The Bureau also found that, for the critical phase 
of finishing the form, nonrespondents attributed not completing 
the form to the "amount of work involved." In another study, 
the Bureau discovered that the difficulty — experienced or 
perceived — associated with completing a self-enumeration form 
can adversely affect response to subsequent items on the form. 

Confirming these study findings, enumerator supervisors at 
the Jersey City Pretest site told us that some respondents 
commented after enumeration interviews that the short form 
appeared complicated. This concern, along with the perception 
that the form was too long, was also expressed by focus groups at 
the Los Angeles and Mississippi test sites. 

PARTICIPANTS IN RECENT FOCUS-GROUP 
STUDIES WERE CONCERNED OVER SHORT 
FORM LENGTH AND COMPLEXITY 

The Bureau's recently conducted "focus group" studies at 
the Tampa, Los Angeles, and Mississippi pretest sites showed that 
lower income groups generally had questions on, or objections to, 
the housing questions on the short form questionnaire. In 
addition, concerns were expressed that the short form 
questionnaire was too long and complex at the Los Angeles and 
Mississippi sites. 






Focus group studies were generally done to ascertain the 
motivational messages and themes that would encourage 
hard-to-enumerate populations to respond to the census and return 
their forms. The focus group approach attempts to develop 
qualitative insight and direction for further research. While 
these three focus group studies were limited to a total of 95 
participants and the results cannot be projected, they provide 
useful indications of individuals' views of the short form 
questionnaire. 

All three studies showed the participants had concern over 
the housing questions. In a Bureau observation memorandum on 
the Tampa study, the plumbing question caused groups to discuss 
why any of the housing questions were needed for the census. In 
the report on the Tampa focus groups, views were reported that 
some of the information requested for the census (e.g., value of 
an individual's home) was "none of the government's business." 
At Mississippi, housing questions such as those concerning 
entrance to living quarters and value of home were considered too 
personal, and participants wanted to know "why do they need to 
know all those other questions" when the census is a count of 
population. At the Los Angeles site, housing questions on 
"entrance to living quarters," "value of property" and "rent" 
were objected to the most. 






In a memorandum of observation on the Los Angeles study, the 
Bureau noted that those queried considered the questions on 
plumbing (which were thought to be "ridiculous"), value of 
property, and rent to be too personal. According 
to the Los Angeles report, most of the objectionable questions 
related to housing items. These items were "specifically 
perceived to be a means of tricking people into exposing 
additional unreported individuals who occupy the household." 
Also, the number of rooms question was considered by some to be 
outside the purpose of a head count. One of the Los Angeles 
report's conclusions states that "objections were raised to some 
of the questions contained in the Census short form (particularly 
the housing-related questions) to the effect that their perceived 
true purpose is surreptitious." 

At the Los Angeles and Mississippi sites, the short form's 
length and complexity concerned the participants. In 
Los Angeles, initial reactions to the short form were "it's much 
too much, too long"; "it looks too big. What they're asking, you 
should be able to put it on a 3 x 5 index card." The Bureau's 
observation memorandum on Los Angeles noted that a first reaction 
to the short form was "wow ... too big ... overwhelming." The 
Bureau's observation memorandum on Mississippi noted that 
people said the "form looked complicated, too long ... too 
nosey." The memorandum further noted that people felt that it 
was not very complicated when they were walked through the 
questionnaire. One representative quotation in the Mississippi 
consultant's report was "too much work to filling that out." 

IMPROVED DATA ACCURACY 

The degree of literacy skills affects data accuracy and 
form completion. Respondents with marginal literacy skills 
should be better able to respond correctly to a simplified and 
reduced short form questionnaire. More people could respond, 
resulting in a more precise population count; and since less data 
would be collected through door-to-door enumeration, enumerator 
bias would be reduced. 

The Bureau has not conducted any literacy tests since the 
1970's when experiments were conducted to measure respondent 
literacy skills or the reading level required to complete the 
short form questionnaire. Literacy level is an important issue 
in developing a successful questionnaire. As such, the Bureau 
should have gained some insights into problems people have in 
understanding census questions by conducting studies with pretest 
respondents. While realizing that the impact of marginal 
literacy skills on accurate completion of the short form is 
unknown, it seems reasonable to assume that a simpler, reduced 
size, easier-to-read form could only help. 

Moreover, since the collection and processing of data could 
be accomplished more efficiently and effectively using a 
streamlined short form, the population counts could be tabulated 
earlier allowing more time to review the results. Additionally, 
a streamlined short form would allow the Bureau more flexibility 
in considering data capture equipment. 

Now, I will discuss the subject of data capture equipment. 

BUREAU'S DECISION NOT TO USE OMR 

We believe the Bureau's decision to discontinue 
consideration of OMR equipment was influenced by its late start 
in detailed planning, reluctance to revise the questionnaire 
form, and a slow procurement process. Whether OMR equipment 
could have been adapted for use during the 1990 census may never 
be known. However, because of its actions, the Bureau has 
excluded an option for using new technology without fully 
exploring its potential. 

The commercially available OMR equipment considered by the 
Bureau had both advantages and disadvantages for a census. The 
OMR equipment employs a one-step process to read and record the 
data. The Bureau's traditional film to tape data capture system 
called FACT required three sequential processes — filming, film 
development and scanning the film. Also, on the basis of our 
observation of the 1985 pretest, the OMR equipment was easy to 
operate, and training time was short. For some of the FACT 
processes, a knowledge of chemistry and other technical subjects 
is needed. 

On the other hand, the commercially available OMR was 
designed to process a single page form much smaller than the 
census short form questionnaire. In addition, the Bureau for the 
past several censuses has used a multipage booklet form for its 
long form questionnaire. The OMR was also designed for use in a 
controlled environment such as grading test answer sheets where 
the students are provided with #2 pencils. The Bureau's FACT 
equipment is generally not affected by these constraints. 

Despite these known disadvantages, the Bureau decided to use 
existing OMR equipment in its 1985 pretest, and as a result the 
pretest proved very little. Most of the major problems that 
occurred with the OMR in the test had previously been known. The 
test simply helped to identify the magnitude of these problems. 

The prospective vendor had made several proposals to 
overcome the limitations by designing a modified OMR but the 
procurement effort for a modified OMR was protracted and 
eventually terminated. Thus, no OMR is being used in the 1986 
pretest, although testing data capture alternatives was a major 
objective of the test. 




In November 1985, the Bureau decided to terminate its 
consideration of OMR technology as the primary data capture 
technology for the 1990 census. As a result of its decision, the 
Bureau ruled out the possibility of exploring the usefulness 
of this equipment, the new technology being considered for the 
1990 census. Thus, the Bureau limited its options to the 
traditional FACT system and data keying. 

From the beginning, the Bureau was aware that the 
commercially available OMR equipment did not satisfy all existing 
decennial census needs such as paper size and the use of a 
variety of marking instruments. Thus, testing the unmodified 
OMR equipment, unless the Bureau was considering revising its 
requirements, did not seem prudent. Absent this possibility, the 
Bureau should have formalized its requirements and initiated an 
effort to test a modified optical mark reader early in the 
decade. 

CURRENT DECISIONS AND PLANS ON DATA CAPTURE 

The Bureau's present plans on data capture and processing 
office configuration, as developed in late April, will result in 
a processing operation for most of the nation similar to that 
used in 1980. Our understanding is that the Bureau's FACT 
system will be used as the primary data capture technology. 
Filming of all questionnaires will be performed in 10 to 15 
processing offices. The location of the film processing and 
scanning has not yet been decided. Questionnaires from 
hard-to-enumerate areas (possibly 12 percent of the nation) will 
be sent directly to these processing offices to be captured and 
where they may be automatically edited. 

For the remainder of the nation, questionnaires will be 
returned to district offices, where they will be manually edited, 
similar to the 1980 census. Only after the questionnaires are 
"perfected" will they be sent in batches to the processing 
offices for data capture. In 1980, questionnaires were not sent 
to data capture until all questionnaires from a district were 
reviewed, "perfected" and batched, and the office was closed. 

We have several reservations about the Bureau's late April 
decisions on data capture and the data processing office 
configurations. The Bureau's current plan for data processing is 
a hybrid system which incorporates concurrent data processing for 
a small portion of the nation and a modified 1980 data capture 
for the remainder of the country. On the one hand, we are 
pleased that the Bureau has decided on these basic automation 
activities 5 months earlier than originally planned as we have 
previously advocated, albeit without evaluation information from 
the 1986 pretest. We are also pleased that the Bureau has 
decided against using data keying, a relatively slow, error prone 
and expensive technology as a primary data capture technology. 
On the other hand, we are concerned with the decision to forego 
concurrent processing and revert to manual procedures for most of 
the nation because we believe the Bureau has thus foregone the 
benefits that could be derived from a more automated operation. 

Sufficient details of the plans for the processing 
operations have not been developed to allow us to fully assess 
them. However, on the basis of the information obtained to date, 
we believe that the Bureau has compromised its goals for 
automation and for the census as a whole. 

Because the majority of the questionnaires will not be 
captured until they are manually edited and "perfected," the 
important benefits of concurrent processing will not be obtained. 
Manual processes, particularly editing, will be used. Thus, 
some of the benefits from automation including speed and 
consistency and accuracy will not be obtained. Instead, a small 
army of temporary employees will probably be used. About 37,000 
clerks were employed in the 1980 census to check returned 
questionnaires for complete and consistent entries. An early 
automated back up file will not be prepared. And premature 
destruction of the questionnaire forms as occurred in the 1980 
census would remain a potential problem. 

Maintaining controls over forms returned may be difficult 
if the questionnaires are sent to the processing office on a 
piecemeal basis. If information, such as population count, is 
not manually recorded from the returned questionnaires in the 
district offices, it also may be difficult to perform some 
coverage improvement programs, such as local review or 
recanvassing neighborhoods. 

Additionally, we believe that the use of manual processes 
in most of the district offices and the use of 10-15 processing 
offices as compared to 3 in the 1980 census will help drive the 
per household cost, exclusive of inflation, of the 1990 census 
beyond the cost of the prior census. 

We intend to monitor the continuing developments on data 
capture and processing office configuration because of the 
importance of the data processing operation in a decennial 
census. 

This concludes my remarks, and I would be happy to respond 
to any questions. 






RESPONSES TO FOLLOWUP QUESTIONS 
MAY 15, 1986 HEARING BEFORE THE 
SUBCOMMITTEE ON CENSUS AND POPULATION 
COMMITTEE ON POST OFFICE AND CIVIL SERVICE 



Q. 1. In your testimony, you recommend that the 
Bureau test a shorter form than has been used 
to date. Specifically you advocate dropping 
most of the housing questions from the 100 
percent questionnaire. You based this 
recommendation on a finding that federal 
agencies don't use the 100 percent data. We 
have heard that many local governments do use 
this data for planning and in connection with 
federal grant applications. 

-- Have you looked into the way local 
governments use the 100 percent housing data? 

-- What would you say to a Mayor or City 
Administrator who told you that they need the 
100 percent housing data in order to file 
applications for federal discretionary 
grants? 

A. Our work was generally directed to federal agencies 
because they are major users of census data and the 
Census Bureau had obtained justifications for its 
questions from them. In our work to date we have done 
very limited work on determining the use made by local 
governments of 100 percent housing data. However, on 
the basis of reviewing grant legislation and our 
limited work on local government's use of census data, 
we believe that data are rarely needed at the block 
level for grants. Sample data satisfies data 
requirements at higher geographical levels. 






In responding to a local official who claimed to need 
100 percent housing data, we would explore with the 
official his/her specific needs and try to determine 
the validity of the official's contention and whether 
alternative data might satisfy his/her requirements. 

At the subcommittee's request we plan to do additional 
work to determine if there are valid needs for 100 
percent housing data at the local level. 

Q. 2. GAO strongly advocates that the Census Bureau 
ought to carry out a test of the short form. 
What are GAO's expectations from such a test? 

-- The return rates of the 1985 and the 
1986 pretests were poor to fair. What return 
rates on the shortened simplified form would 
be an acceptable standard for GAO? 



A. GAO advocates the use of a streamlined short form to 
determine if it can achieve several advantages 
including a greater mail response rate, more complete 
questionnaires, a higher quality of data, possibly more 
cost efficient data processing options, and a reduced 
respondent burden. A greater mail response rate, fewer 
incomplete questionnaires, and a more cost efficient 
data processing option all should result in a less 
costly census. The Bureau's estimate of savings of $6 
million for each 1-percent increase in mail response, 
basically because of reduced follow-up, is a good 
criterion in measuring the value of a streamlined form. 
Thus a 2-panel test, comparing the mail response from a 
regular short form to a streamlined form, would go a 
long way in determining the value of a streamlined 
form. In addition such a test could be used to compare 
the number of questionnaires that do not meet the 
Bureau's criteria of completeness. The Bureau follows 
up on incomplete questionnaires to collect the missing 
data. 



Besides the cost of doing follow-up work, the 
interaction of a Bureau temporary employee injects a 
factor of enumerator bias into the data collected. 
This affects data quality. Moreover, removing the 
housing questions from the short form for the 1990 
census would result in 85 million households answering 
eight fewer questions. This would reduce the 
respondent burden. 

Q. 3. For the past several decades, the census 
planners have assumed that while there is a 
limit beyond which the census form could not 
go, generally up to that limit, the basic 
cost of getting to the household and getting 
the form back was not affected by the length 
of the form. Today you have testified that 
this assumption is wrong. Could you explain 
the evidence for your assertions? 

-- What specifically leads you to 
believe that a very short form will greatly 
improve the response rate and reduce the cost 
of the census? 

A. In the absence of a specific test, we do not know the 
extent to which a streamlined form will increase the 
mail response rate. However, we have evidence that 
indicates there would be an increase in the rate. We 
have examined the mail response rates from the censuses 
in 1970 and 1980 and pretests in 1985 and 1986. They 
have indicated that a higher response rate is obtained 
from the short form compared to the long form. For 
example, in the 1985 and 1986 pretests there was a 6 to 
10 percentage point higher response rate to the short 
form compared to the long form. The difference in 
response is particularly evident in the 
hard-to-enumerate areas. We have also reviewed results 
from the focus group studies in the 1986 tests. 
Participants in those studies, mainly from lower income 
groups, have questioned the need for data aside from 
information needed for the count. As discussed above, 
estimated costs can be attributed to follow-up work. 
In addition more cost efficient processing methods can 
be considered. 

Q. 4. What are some data capture techniques the 
Census Bureau ought to look into even now to 
prepare for the 2000 Decennial? 

A. We believe that data capture should be closely 
integrated with the data to be collected. The amount of 
data to be collected and the type of information such 
as numbers and names have a significant bearing on the 
data capture technology. Possible data capture 
techniques to be considered for the 2000 decennial 
include data imagery, a relatively new technology, 
hand-held personal computers, and direct telephone 
input to computers. These latter two methods might 
help pave the way for a paperless census. 

Q. 5. Please list some of your recommendations to 
enhance the long term planning cycle for 
decennial censuses. 

A. In order to improve the Census Bureau decennial 
planning cycle the following could help: 

-- a long range or strategic planning group 
   should become a reality, 

-- a strong research and experimental program 
   be included in the current decennial 
   census, and 

-- a permanent core group with sufficient 
   authority be maintained in the Census 
   Bureau to provide continuing direction for 
   decennial planning activities. 






Mr. Garcia. Thank you, Mr. Dodaro. 

What I would like to do is to ask the Bureau of the Census their 
views. Just let me ask this one question first to the Census Bureau. 
We are going to hear from the National Computer Systems repre- 
sentative that they are prepared to process a form that had more 
than one page. What they are going to do was to slit the form, 
process it, and then reassemble it. This is what they already do for 
a number of OMR documents that they process. Why was it unac- 
ceptable to the Bureau of the Census? 

We are going to hear from Mr. Gail Frank? in just a little while, 
but I would like first to get a response from the Bureau. 

Ms. Miskura. Mr. Chairman, controlling KfO million forms, of 
which 20 million are likely to be long forms, is a massive operation, 
even under the best of conditions. Requiring additional operations 
to slit, process, and reassemble the forms would add to the risks, in 
terms of timing, costs, and the accuracy of the data. 

Small problems with missing forms or pages stuck to each other 
could cause us major problems. The recent processing decision that 
we made will utilize the cameras — the page-turning cameras — that 
we had, like in 1980, which performed very well for us, and which 
did not require us to take the booklets apart. 

We understand that NCS has that capability, but we think we 
have minimized our problems overall and will be able to control 
the Census much better using the FOSDIC cameras. 

Mr. Garcia. You have heard GAO testify. There are a number of 
areas in which they have some, I believe, constructive criticism. 

I guess the first one is the question of the cost. 

Do you have a problem with what they said? 

Ms. Miskura. We have considered cost in all of our developmental 
work and our decisions about automation for the 1990 census. 
We will only use those automation techniques that we feel will be 
cost-effective or improve the accuracy and timeliness of the Census. 

While it is very difficult to attach an estimate of savings to an 
individual automation decision, we do believe that the overall 
system that is designed will be most cost-effective. 

We will be using a number of automation techniques that we cer- 
tainly feel will be cost-effective. In particular, the automated geo- 
graphic support system, or TIGER system; the automated address- 
control file; an automated check-in, which will be used throughout 
the country; automated edits in some parts of the country; and the 
earlier data capture, much earlier than in 1980, for the whole 
country also; as well as our automated management information 
system. 

Mr. Garcia. Yes; as you know, I chaired the Census Subcommittee 
from 1979 to 1982, and I remember going back to the Appropriations 
Committee for additional funding to complete the 1980 
census. We were assured that the costs of the 1980 census would be 
contained and we would not have to do that. But as it turns out, we 
had to do that. 

You know what happened just recently with all the publicity on 
the escalating costs of the space shuttle. They used all sorts of 
formulas to say what they said because they are going to have more 
space shuttles taking off. They said that the large numbers will 
more than compensate for the cost. I just think that we are living 
in a world today, and it's true especially here in Washington, where 
we need to call things the way they really are and not the way you 
want us to hear them. 

It just seems to me that it is going to be very difficult, I believe, 
in 1990 to go back to the well. 

Why don't we just start from the beginning and really say that 
there may be some overruns, and deal with it in that fashion? 

Mr. Bounpane. Mr. Chairman, I would like to say a few words 
about that because I think that you have stated the problem cor- 
rectly. 

If you take a look at the census costs from 1950 on, and remove 
inflation from them to put them on a common base, the cost per 
unit has gone up, census by census by census. 

That, I think, comes from two reasons, the first being that the 
census changes over time. We try to do a better job with each 
census. 

The second reason is that it gets harder to take the census over 
time in our complex society. And so, yes — to then say that that line 
of increase is suddenly going to stop and level off between 1980 and 
1990 is an extremely ambitious goal. 

In trying to cost the 1990 census we tried to seek the appropriate 
balance between asking for funds in a very restricted budget envi- 
ronment and seeing what we could do about modernizing the 
census to make savings in some places to pay for increases that 
might exist somewhere else. 

I think you are correct when you say that there are possibilities 
that we are not going to be able to meet that goal, and some of us 
have concerns about that as well. 

What we want to do now is try to produce the cost of the census 
in a much more detailed manner, now that we have made some 
very basic decisions, to see what we think that total cost will be. If 
it is more than the stated goal we will come to the Congress and 
ask for those funds in advance. Because as you pointed out, the sit- 
uation we experienced in 1980 was not a good one. We, like you, do 
not want to experience that, in terms of coming in for additional 
moneys at the 11th hour and all the associated problems with that. 

One potential way to handle that is to have some kind of reserve 
set aside for unexpected difficulties. For example: Budget the 
census at an 80 percent mail return rate, but have a reserve set of 
funds available such that if, for some reason, the mail return rate 
in 1990 should come in at 75 percent, there are moneys available to 
attack that problem. 
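
The reserve described here can be sized with simple arithmetic. The sketch below is illustrative only: it borrows the $6 million per percentage point followup estimate cited earlier in the GAO statement and assumes it applies to a shortfall in the mail return rate as well. Mr. Bounpane gave no dollar figure, so the function name and the resulting amount are assumptions.

```python
# Assumption: the followup cost per percentage point of mail response,
# in millions of dollars, as cited earlier in the GAO statement.
COST_PER_POINT_MILLIONS = 6


def reserve_needed_millions(budgeted_rate, actual_rate):
    """Reserve (in millions of dollars) to cover a shortfall in mail return.

    Rates are whole percentages. No reserve is drawn when the actual
    rate meets or exceeds the budgeted rate.
    """
    shortfall_points = max(0, budgeted_rate - actual_rate)
    return shortfall_points * COST_PER_POINT_MILLIONS


# Budgeted at 80 percent, actual return of 75 percent (the example given).
print(reserve_needed_millions(80, 75))
```

Under these assumptions, the 5-point shortfall in the example would call for a reserve of roughly $30 million.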

If I could add one more thing, though, about automating the 
census, and the cost of it. It is not necessarily true that automation 
saves money in a census environment. It is very true in an ongoing 
operation. If you can automate routine tasks, that should save 
money over time, because you have the chance to amortize huge 
investment costs of equipment. 

Unfortunately, in a census the opportunities to do that are rela- 
tively limited. You have to pay a large amount of money to develop 
machinery and to purchase it. You use it only for a very short 
amount of time. And though it may help you save money by elimi- 
nating a labor-intensive task, on the other side of the scale is the
huge amount of money you had to spend to develop and procure it.
And those do not necessarily balance. 
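
[Editor's illustration, not part of the testimony. The amortization argument above can be sketched as a break-even calculation: the same equipment, with the same monthly labor saving, pays for itself only if it is used long enough. All dollar figures and names below are placeholder assumptions chosen to make the arithmetic visible.]

```python
def net_saving(investment, labor_saved_per_month, months_in_use):
    """Labor savings accumulated over the period of use, minus the up-front investment."""
    return labor_saved_per_month * months_in_use - investment


def breakeven_months(investment, labor_saved_per_month):
    """Smallest whole number of months of use at which savings cover the investment."""
    return -(-investment // labor_saved_per_month)  # ceiling division on integers


INVESTMENT = 60        # assumption: placeholder, millions of dollars
SAVED_PER_MONTH = 5    # assumption: placeholder, millions of dollars

census_use = net_saving(INVESTMENT, SAVED_PER_MONTH, months_in_use=6)     # short census run
ongoing_use = net_saving(INVESTMENT, SAVED_PER_MONTH, months_in_use=36)   # continuing operation

print("months to break even:", breakeven_months(INVESTMENT, SAVED_PER_MONTH))
print("net saving, 6 months of use: ", census_use)
print("net saving, 36 months of use:", ongoing_use)
```

[With these placeholder numbers the equipment needs a year of use to pay for itself, so a six-month census run ends in a net loss while a three-year ongoing operation ends well ahead.]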

To me, the real advantage of automation comes in better control 
and better quality, which is important; not necessarily lesser 
money. Sometimes those facts get a little bit confused, and I 
thought it was worth putting that on the table. 

Mr. Garcia. I see that Mr. Dodaro would like to respond to that. 
It is very important that we work together on this. It is good to 
have somebody looking over your shoulders. Constructive criticism 
is going to benefit us in the long run. 

Mr. Dodaro. 

Mr. Dodaro. Mr. Chairman, I think Mr. Bounpane makes a
number of excellent points. But I would like to point out that the 
household cost between the 1970 and the 1980 census more than 
doubled, even after excluding the increase in inflation. 

I think that it presents a number of challenges to the Bureau, 
and will not be achieved without taking some risks and challenging 
the status quo, and may be more than the Bureau has been willing 
to do in the past. This is somewhat understandable, however, since 
it is a large operation with a lot of components. 

There is a long lead time for the 1990 census, and the key is to 
maximize the use of that time to carefully explore some of these 
options. The time has not been used properly in the past, to the 
best advantage. Some of the testimony that you heard on May 1, 
from the academic community, and some of the testimony that you 
are probably likely to hear from NCS is along the same lines. 

While it has not been conclusively proven, as Mr. Bounpane
points out, that automation will save money, it has been proven 
over time that reducing the number of people needed to take the 
census, particularly the training associated with bringing in thou- 
sands upon thousands of temporary employees, would be cost effec- 
tive. 

The direction that we need to move in is the one that you are 
pushing the Bureau in — is to really challenge the status quo and to 
be more aggressive in pursuing some of these options, to try to take 
hold of the situation and be more assertive. 

Mr. Garcia. Just let me say to the panel from the Bureau of the 
Census that — I think we have to be a little imaginative. Yes, we do 
have the census that comes at the beginning of each decade. Why 
can't we, during the course of the other 6 or 7 years be imaginative 
and go out and market that system. And in-house, use that system 
for other agencies or whatever the case might be. Even going out
into the private sector. It seems to me that the Bureau of the
Census, which I have tremendous respect for, and which, I believe, 
most people who know your function have respect for, would be 
only too happy to work with you in a way in which you could be 
out marketing your product. The current system that you have 
there, which you are concerned about, would have a tremendous 
cost overrun. 

We have to be a little imaginative. And I think it would be very 
productive. Go ahead Peter. 

Mr. Bounpane. Is it OK to add something here?

Mr. Garcia. Yes. 







Mr. Bounpane. I think Gene made a good point. Many of us wish 
we had, perhaps, done a little more testing in the 1980 census to 
help us toward 1990. At this point in time we can't change that, 
but we certainly can learn from that and say "test more in 1990 to 
learn about the future so we are not faced with this problem come 
1993 or 1994." That is an excellent point, and I think we hear that 
clearly. 

Also, I heard what you said, Mr. Chairman, about being a little 
more imaginative about things and I think we can take that back 
and see what opportunities we have for that, as well. 

I would like to point out, however, that automating the census — 
we always talk about it with regard to the capture of the informa- 
tion from the questionnaire into the computer. That is only one 
part of the total census process. There are many others, and we 
have, I think, been imaginative in those areas. The program we 
have for automating the maps is a major advance over what we did 
in 1980 and is in direct response to a problem that existed in 1980. 

What we have already accomplished in terms of automating the 
address control file, and allowing for automated check-in through 
bar codes, is again, a direct response to a problem that occurred in 
1980, where that was done manually, and using computers to solve 
a problem that existed then. 

The other point I would like to make is that for all of its nega- 
tive appearance of being technology that's been around for a while, 
the FOSDIC system in 1980 worked very well. The field operations 
lasted some 2 to 3 months, in some cases 4 months, longer than
anticipated in 1980. And yet, we still had to meet the date of
producing the counts to the President, by December 31, the last day of the
year. 

The FOSDIC system made up that time difference. That is, we 
were able to read all 88 million questionnaires in 1980 through 
that system in several fewer months than we had originally
planned. Now, I do not say that that makes it better than anything
else that ever was, but it did operate well in a difficult
environment. And that is a hard bit of experience to overlook.
Notwithstanding the need to, perhaps, move forward where we can find
better ways to do things.

So, I think we have moved on some things, particularly where
there were problems in 1980 but, perhaps, to some people's point of
view, not as far as we could have on some others.

Mr. Garcia. There is a vote on the floor, and it is an important 
vote, so what I would like to do is just call a quick recess. I'll run 
right over and I'll come right back.

[Recess taken.] 

Mr. Garcia. Peter, I think there were some other parts of that 
slide presentation by GAO on which I saw you taking notes. Per- 
haps you would like to try to respond to some of the thoughts of 
the GAO. 

Mr. Bounpane. Mr. Chairman, Susan will make some and then I 
will do some others, OK? 
Mr. Garcia. Fine. 

Ms. Miskura. I think we would like to respond particularly to
the next-steps portion that Gene concluded with. Regarding the
requirement requiring testing of a short short form, in effect, the
feasibility of doing that really is dependent on the need for the data.
Particularly, GAO has expressed concern about the housing
questions.

If we felt that they were not necessary on a 100-percent basis, we
would take them off of the form, and it really would not be
necessary to test that.

The reason those questions are included on the form is because 
in our continuing conversations with a broad variety of data users, 
we are being asked, and they are justifying the requirements for 
those data at small area levels. 

We also feel that a number of the housing questions are neces- 
sary to assure coverage both of the population and of the housing 
inventory. 

We realize that GAO's discussions with some Federal-level users 
have come up with information that is inconsistent with the infor- 
mation that we have collected. We are planning, on the basis of the 
information they obtained, to go back and talk to the members of 
the housing interagency working group at the Federal level, and 
also representatives of State and local governments to have them 
reassess their need for the 100-percent housing data. 

Mr. Garcia. Just let me interject at this point, and it relates to 
my meeting yesterday with GAO. I have some problems with some 
aspects of the GAO's criticism because we have a special need
today for the housing data the GAO proposes to eliminate. Since 
1981 we have had a steady decrease of construction of housing for 
the poor and for the lower income people. Because of that there 
have been many instances in which, I know for a fact in my city, 
there have been families who have been doubling up. 

Now, it is based on that that I have some problems with the 
GAO's criticism because we need that data. There is no question 
we need some of that data. 

There may be some census data that are questionable, but in 
terms of the number of persons who are residing within a particu- 
lar apartment, as in the case of public housing in New York, I 
think that the housing data is absolutely essential. And it just 
seems to me that we are going to have a very difficult time getting
that information anyway. Many of the people are not going to will- 
ingly come forward to volunteer information that there is an uncle 
or an aunt, with children, living there. These people have no other 
place to go so they have to double up.

So it is based on that that I feel that the housing questions,
some aspects of it, are extremely important.

I think counsel would like to ask a question at this point.

Ms. Fernandez. This question is directed to the Bureau. One of 
the concerns that we have, regarding the use of the short form, is 
particular to the housing questions. But it appears from the testi- 
mony, though you talk about the respondent burden, that it may 
not necessarily be so, as GAO suggests, that there be some respond- 
ent burden. It seems as though there is a tapering off of responses 
within a particular timeframe of the questionnaire. 

For example, someone may start response but at the tail end, 
which are the housing questions, are not responded to. And that 
generates the additional cost of sending in an enumerator to get 
that information. 






Have you looked to alternatives? For example, if the housing 
questions were removed from the short form, to increasing the 
survey, the 20-percent form additional housing questions where the 
quality of the data, the release of the data, would be earlier. Would 
that not offset some of the concerns that the data would be— that 
the questions be asked on the 100-percent count? 

Have those other options been explored by the Bureau? 

Mr. Bounpane. Yes. If I could answer that question plus some- 
thing that Mr. Dodaro said earlier that sort of confused me — I 
think they are related. 

Some of the questions, the housing questions, are needed for very 
direct reasons, like defining "is this a housing unit, or not"; and 
secondly, for some very basic planning that is necessary to know 
what a city should do, in terms of improving the situation and 
making changes. 

We can argue about whether it is four questions that are really 
needed, or six questions that are really needed, or eight questions 
that are really needed. We can have that discussion, but it is about 
the number of questions, not the validity of the data need. 

One thing that sort of surprised me, and maybe I misheard the 
GAO this morning, was a recommendation that all eight could be 
moved to the sample. We have never investigated that, moving 
every housing question to the sample questionnaire. Surely some 
are required on a 100-percent basis. 

Some of them could be moved, I think. Not all eight, but I think 
we can take a good look at two or three of those to see whether or 
not they really are required of everyone, or if we could do just as 
you suggested, put them on the sample form, so that they would 
only be asked of 20 percent of the people. 

I would only like to point out one thing about that, and that is 
looking at error rates from 1980, that is, the number of questions
that were not answered that should have been answered. That
error rate is worse on the long form than it is on the short form.
That follows logic, of course. You would expect that. So that the
more questions you put on the sample, the more likely you are
going to get a nonresponse which could, in fact, [illegible] field work
anyway.

As Susan was pointing out, the key issue is the need for the
information. If it is a true need, and a true need in a small area like a
block, then there is no alternative to asking it of everyone.

If the need is not for a small geographic area, but could satisfy
for just the city as a whole, or a State, then surely asking it on a
sample is more appropriate.

I think there are, perhaps, two or three of those housing ques- 
tions that we should look at carefully again, to solve this bit of a 
dilemma that GAO's request to the Federal agency got a different
answer than our request to the Federal agency. We have to solve
that and say who is right. And if it is really not needed on a block
area level then certainly moving it to the sample is something that
should get serious, serious consideration. 

Ms. Fernandez. I know in previous hearings we talked about the 
questionnaire content and looked at the need for data. GAO in its 
testimony is raising questions not only regarding the need for the 
data on the 100-percent basis but also what data, in fact, the agencies
are receiving. It appears that the agencies are getting survey
information, sample information, and that appears to be adequate 
for the Federal needs. The GAO analysis is limited to a review of 
Federal users, excluding State and private data users. 

Has the Census Bureau ever done an evaluation? I know at other
hearings you responded with Federal legislative requirements of
data users and other information like that, but the actual
evaluation of where the need is and how that need correlates to the level
of information — at the block level, the tract level, the county 
level— and done a thorough study on that? 

Has the Bureau done that recently? Or at any point? 

Mr. Bounpane. Not systematically, to my knowledge, at all. 

Mr. Dodaro. That is an excellent point that has been raised. 
There is a question of need but there is also a question of what is 
the most cost-effective way to meet that need; and what are some 
of the other options that could be available to satisfy that need 
rather than asking it of every household in the questionnaire. 

I will ask Mr. Donoghue to elaborate a little bit further, but in 
terms of the points that Mr. Bounpane was confused on, I would 
like to eliminate some of that confusion. The questions that he is 
talking about, that are needed for coverage purposes and defining 
the housing unit, are ones we believe should remain on the form. It 
is the other questions that we think could be taken off. 

We found there is a distinction between justification or desire for 
the information and what is actually used and the knowledge of 
the people from what they are actually using. Some people believed 
they were using 100 percent information, but in reality it was the 
public-use tapes from the long-form questionnaire.

So I think it really requires some probing and study to get 
behind some of the questions in terms of how they are using the 
information and what are other ways that those needs could be 
met without asking every household in the country to respond to
how much rent they are paying, for example. These are the areas 
that we are suggesting that ought to be revisited and reexamined. 

Ms. Fernandez. One of the things that concerns us, and we
discussed this previously in regard to the GAO proposal for the short
form, is that moving the housing question would appear to make
it more difficult for data users to look at very specific
characteristics of the communities, vis-a-vis their housing needs.

For example, if you are looking at the aged population, because
the data is not in the 100-percent count, it may be difficult to deter- 
mine what their specific housing needs are in particular States or 
counties. 

Has the General Accounting Office explored what the loss of the 
data would be, or whether those characteristics across the Nation 
can be maintained by sample survey form? 

Mr. Dodaro. I would like to have Mr. Donoghue elaborate on 
this but I will give it a general stab first. 

Many of those characteristics of information are embodied in a 
lot more detail on the long form questionnaire, which could be 
sampled. And I would like to, also, reiterate the point that it is a 
20-percent sample which is used to make estimates at the county 
and tract level, which are pretty good-sized levels and it is a
good-sized sample of the country.







Mr. Donoghue. I think Ms. Miskura and Mr. Bounpane could 
probably respond better toward the loss of precision, but certainly 
at the State and county level the estimates from sample data 
would be sufficient to meet most State and local user needs. 

What we are talking about here is how many people have to 
have the data at the block level, and have to use that data to make 
decisions. 

The other aspect of our recommendations would not impact the 
size of the long form since all the questions on the short form are 
already on the long form, so if you took them off the short form the 
long form would still be the same size. 

Ms. Fernandez. Just one other point. The General Accounting 
Office has proposed short-form testing. At previous hearings, the 
Bureau has stated that the New Jersey pretest, as applied, was a 
test of the short form. The GAO says that it was not. 

Would the Census Bureau be amenable to testing a short form as 
suggested by GAO without the housing questions, either in a pre- 
test or in some other mechanism to test whether or not, as they 
suggest, the response rate would increase? 

Mr. Bounpane. I think we can certainly look at that and see if 
there is an opportunity to try to test that, keeping in mind that if
the information is really not needed at small areas, then testing 
that would give you some information. But I do not know what you 
would do with it if you still had to produce that information for 
small areas. 

It seems to me that the first question has got to be answered, 
and we would certainly support some kind of an investigation into 
what are the true uses of the information from the census and 
whether or not all of those housing questions really are needed for 
small geographic areas. We have not made that kind of a study in 
depth. 

Mr. Dodaro. We would support that. I think Mr. Bounpane is 
right. You have to weigh the needs of the information against the 
benefits of it. But we would support doing both, so you have both 
pieces of information available.

In other words, pursuing the need question, keeping in mind op- 
tions that could be considered to meet those needs at less cost. But 
also testing whether or not you do, in fact, get any benefits from
increased mail response rates by using the shorter form. Then, 
having those two pieces of information, you will be able to weigh 
whether or not the benefits of increased mail response rates are 
worth the degree of precision and the need for some of the informa- 
tion. 

Ms. Fernandez. It also may open up more automation options 
because the form would be shortened. 

Mr. Bounpane. Let me just make one point about the sample 
size that both the chairman and you mentioned, because I think it 
is important. 

Yes, 20 percent is a large sample by most standards, and it is 
easy to say that is a big enough sample size for information by 
tract or by county, but that is if you are talking about the entire 
universe. 

Let me try to give you an example. Suppose you are interested in
information about single, elderly women who rent their unit. Now,
it is conceivable that a city would want information on how many
women over 75 live alone and rent their unit. And they might 
want that information by neighborhood, not solely for the city as a 
whole. It seems to me a reasonable piece of information to want to 
have. 

Now, at the present time you could get that when collected on a 
100-percent basis. Alright? But if you asked renter and owner on a 
sample, rather than 100-percent, you can't get that information by 
very small areas. Alright? And you start to lose information very 
quickly for subsets of the population. 

It is important to remember that, as well, when discussing the 
sample-size argument. 
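
[Editor's illustration, not part of the testimony. The point about subsets can be sketched with a standard sampling formula. Assuming, for simplicity, that each household falls into the sample independently with the stated probability (an assumption that ignores the actual census sample design), a count estimated by scaling up the sample has a relative error of sqrt((1 - f) / (f * K)), where f is the sampling fraction and K is the true count in the area. The function name and the example counts are hypothetical.]

```python
import math


def relative_error(true_count, sampling_fraction):
    """Coefficient of variation of a count estimated by scaling up sample hits,
    assuming each unit is sampled independently with the given probability."""
    return math.sqrt((1.0 - sampling_fraction) / (sampling_fraction * true_count))


# Hypothetical counts of the subgroup of interest at two geographic levels.
city_wide = relative_error(true_count=10_000, sampling_fraction=0.2)
one_neighborhood = relative_error(true_count=10, sampling_fraction=0.2)

print(f"city-wide relative error:    {city_wide:.0%}")
print(f"neighborhood relative error: {one_neighborhood:.0%}")
```

[Under these assumptions a 20-percent sample pins down a subgroup of ten thousand to within a few percent, but for a subgroup of ten in one neighborhood the error is more than half the count itself. A 100-percent count has no sampling error at all, since the formula gives zero when f equals 1.]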

Mr. Garcia. That is an excellent point. 

I think what we have to do here is that there are many local gov- 
ernments that are going to need this information. There is no ques- 
tion about it. But I think that some of the criticisms that have 
been stated by the GAO deserve consideration by the Census 
Bureau. If you could put your heads together, I think we may come 
out with some sense of getting data that we could all live with, and
from which we would get maximum use out of for the next decade, 
for the decade of the 1990's. 

There are several other questions that have to be asked, but in 
fairness to our next witness Mr. Gail Franke, this panel will stop
here. Before Mr. Franke comes on I would like to thank both the 
GAO and the Bureau of the Census for their participation. 

Now, I would ask you to stay, because we may need you later on 
for questions and answers. Mr. Gail Franke, who is the vice
president of Federal Government marketing, National Computer
Systems, is going to testify now. There may be something which we
will ask all of you when he finishes testifying. 

STATEMENT OF GAIL FRANKE, VICE PRESIDENT, FEDERAL 
GOVERNMENT MARKETING, NATIONAL COMPUTER SYSTEMS 

Mr. Franke. Mr. Chairman, on my left is Mr. Robert Roelf. He is 
NCS technical project manager for census-related activities. And 
on my right is Mr. Jeffrey Goldberg, NCS program manager for the 
Bureau of the Census. 

I must say I was relieved to see you excuse the previous wit- 
nesses, Mr. Chairman. Between GAO and the Bureau is not exactly 
the place I wanted to be and that appeared to be the only open 
chair, so. [Laughter.] 

This is much more comfortable, with all due respect to my good 
friends from both those agencies. 

On behalf of myself and National Computer Systems we appreci- 
ate the opportunity to appear before your subcommittee today. 

I would like to offer some summary comments on the material 
previously submitted to the subcommittee and these comments will 
address four issues. 

First, a brief, but I believe pertinent, statement describing NCS,
the corporation, which is to fundamentally establish our creden- 
tials in the optical mark reading arena. 






Second, an abbreviated description of our involvement in the 
OMR evaluation process conducted by the Bureau, and our assess- 
ment of the results that obtained therefrom. 

Third, our observation of the Bureau's overall planning process. 

And last, some general recommendations relative to a research 
and development program which we feel the Bureau should imple- 
ment immediately. 

To get to the first of those issues, since 1962 National Computer 
Systems, through its OMR systems, has been providing technologi- 
cal solutions for the large scale data collection needs of education, 
government, and industry. As a corporation, NCS in the fiscal year 
just ended generated revenues in excess of $215 million, and main- 
tains a work force of nearly 2,000 employees nationwide. 

While OMR systems were first pioneered in the area of educa- 
tional measurement, they have since been successfully used for 
many other large scale data collection tasks including surveys and 
assessments, application processing, health care reporting, and 
order entry, to name just a few. 

In addition to the manufacture of OMR systems, NCS is also one 
of the largest processors of OMR scannable documents in the 
world. Receiving annually over 130 million scannable forms at its 
Iowa City, IA, facility, NCS has established industry standards for 
the logistics management of large volume data collection projects. 

NCS's experience as a supplier of systems and services to the
Federal Government has also been extensive. For instance, we
currently process approximately 7 million applications for student
financial aid through our information services division, and we
provide distributed, as well as centralized, OMR data entry systems to
many agencies of the Federal Government. NCS currently has over
1,000 such OMR systems installed in various Federal installations
around the world.

Finally, we have transported this technology to foreign markets. 
In the international sector, NCS has again served the complex data 
collection needs of a variety of areas. 

It was in the international arena that NCS gained its initial ex- 
perience in supporting census data collection. NCS OMR systems 
have been used to conduct the population censuses of Venezuela, 
Chile, and the Dominican Republic, and the Mexican agricultural 
census. The experience gained in supporting international census 
data collection has served to educate NCS to the special
requirements of census processing.

It was during the period of time when NCS was supporting for- 
eign census programs, that we engaged in initial discussions with 
the U.S. Census Bureau, regarding the possible role OMR could 
play in processing a U.S. decennial census. Between 1980 and 1984 
several delegations of Census officials traveled to NCS facilities to 
view NCS OMR systems and talk to NCS engineers regarding the 
application of that technology in the 1990 census. 

During this same period, NCS personnel toured various Census 
facilities in order to gain a more detailed understanding of U.S. 
Census requirements for 1990, and to understand more fully the 
Bureau's resident data capture technology, FOSDIC. 

In 1984, the Bureau acquired an NCS OMR system, the W-201 
for testing and evaluation. In making the commitment to test this
system, we believe the Bureau recognized two of the major benefits
that we feel can be derived from OMR processing: the economy
and efficiency of direct source document data capture, and the
transportability of this technology.

NCS worked very closely with the Bureau on this evaluation pro- 
gram, focusing on the Tampa pretest to be conducted in March 
1985. Two significant issues surfaced very early on in this activity. 

First, there was the matter of the form. As indicated earlier, 
NCS proposed a four-page booklet for the census pretest form, 
which would be slit for scanning, and if desired, automatically reas- 
sembled after scanning; a methodology used by several major OMR 
processors. 

The Bureau disagreed with this proposal and directed us to
accommodate all short-form data on a single 8½-by-11 sheet. The
result was an admittedly dense document which the Bureau and
NCS agreed would have to be larger and more open in future use.

The second major issue was the No. 2 pencil, the industry stand- 
ard OMR marking instrument. While it was marginally practical 
to insert and mail pencils for the Tampa pretest, some 120,000-odd 
households, it was clear that a much wider range of marking
instruments would have to be scannable in order for OMR to be used
as the primary data capture technology in 1990. 

In January 1985, and as a direct result of preparing for the 
Tampa pretest, we submitted an unsolicited proposal to enhance 
standard NCS technology to produce an OMR system which we be- 
lieved, and were prepared to demonstrate, would meet the census- 
specific requirement. 

This proposal addressed the need for a larger form, the use of a 
wide range of marking instruments, and several other significant 
enhancements. This enhanced system was to have been delivered 
in two phases. The initial system, including the larger form
capability, in January 1986, for use in the Los Angeles pretest; and the
remaining enhanced features in September 1986, consistent with 
the Bureau's then operative schedule for technological evaluation. 

The Bureau, after evaluating NCS's proposal, and reviewing the 
results of the Tampa pretest, concluded that it was beneficial to 
continue the OMR evaluation process. Accordingly, in June 1985 
they announced their intention to contract with NCS for the acqui- 
sition of an enhanced system. 

However, after having received some level of response from some 
other interested parties, the Bureau reversed their decision and
announced that they instead intended to conduct a competitive
procurement.

In September 1985, the Bureau released a draft s[illegible]on for
comment. NCS submitted comments, having been [illegible] of a
prompt release of the RFP. However, in early November 1985 NCS
was informed that no RFP would be forthcoming, and furthermore,
that all OMR evaluation, as related to the 1990 census had, in fact,
been discontinued.

The Bureau gave three reasons for this action. 

First, on the basis of the Bureau's assumptions, the Bureau's own 
FACT 90 system had a more favorable cost-benefit ratio than did 
the OMR solution. 






Second, the Bureau believed it had run out of time, because in 
order for the Bureau to manufacture sufficient FACT 90's they 
could not wait until the fall of 1986 for the decision, as per the 
original schedule. 

And third, the Bureau took the position that the enhanced OMR 
system they would have tested in 1986 would not necessarily have 
convinced them that OMR could support the 1990 census. 

NCS, of course, took exception to the Bureau's stated reasons for 
suspending the program. We still continue to believe that the 
direct data capture capability, coupled with the ability for wider 
deployment, serves to more cost effectively support the dual roles 
of concurrent processing and accelerated turnaround. OMR is a 
much simpler process which in most instances should equate to
less cost. 

With regard to the time constraints, NCS felt that there was 
then ample time to conclude a substantive and complete evaluation 
of OMR and to meet the production requirements for 1990, in that
the system NCS proposed to deliver in 1986 would have given the
Bureau the facility to thoroughly test the entire enhanced OMR 
system. 

In stepping back from the specifics of the Bureau's OMR Tech- 
nology Evaluation Program, there are several observations that we 
feel we can make. 

The Bureau has a long and proud history as a pioneer in the use 
of advanced technology and innovative methods. Recently, howev- 
er, the Bureau, it appears, has had more difficulty making signifi- 
cant technological advancements. 

In planning and implementing a decennial census, the Bureau is 
continually having to make decisions along a risk versus opportuni- 
ty continuum. While the stakes are high in conducting a decennial 
census, it is our observation that the Bureau primarily tries to 
minimize risk, at the expense of seizing the opportunity to create 
fundamental, technological improvements in decennial census oper- 
ations. 

Part of the problem can be traced to the Bureau's planning proc- 
ess. It appears that the Bureau waits too long between decennial 
censuses to begin the process of identifying, evaluating, and insert- 
ing new technology into its operations. 

Consequently, as census day approaches, the Bureau has tended 
to fall back on procedures and systems used in previous censuses 
because sufficient time no longer remains to successfully test, pro- 
cure and implement innovative new automation systems. 

The Census Bureau seems to be repeatedly caught in a continuous
cycle of developing plans to move forward with its decennial
automation only to be forced to fall back on existing data process- 
ing systems. 

This is a problem that has historical perspective in that the 
Bureau has consistently tended to rely heavily on the use of its in- 
house capabilities for the development of any new or innovative 
methods or systems. 

Early in the Bureau's history this was done out of necessity, due 
to the lack of commercially available alternatives. Today, however, 
the commercial sector provides a broad range of products and
services which should be of direct benefit to the Census Bureau.




Finally, where should the Bureau be going for the 2000 decennial 
census? It is very clear to NCS, as it has been to many others who 
have testified before this subcommittee, that planning for the year 
2000 should have begun yesterday. 

The Bureau should implement a program immediately to develop 
a qualitatively superior data capture system. Such a system must 
combine the best features of currently available technologies, in- 
cluding FOSDIC, OMR, OCR, image capture, optical disc technolo- 
gy, online microfilming, and multipage booklet processing. 

Additionally, the design philosophy of such a program must be 
flexible enough to accommodate applicable emerging technologies. 
The goal of this R&D program should be to develop a Census data 
capture system which would exceed all that is currently available 
when viewed from the perspective of cost effectiveness. 

We believe very strongly that this project must be a collaborative 
effort wherein the best resources of the Bureau work closely with 
experts drawn from external organizations. 

Additionally, the Bureau should continue examining and analyz- 
ing all emerging technologies, with specific attention directed to 
those technologies which could ultimately provide a truly electron- 
ic data capture capability. 

Finally, we wish to reemphasize the importance of beginning this 
project immediately, with the goal of having a prototype system, or 
systems, ready for live tests during the 1990 census. 

For this goal to be realized, however, the Bureau must make the 
R&D program of the highest priority. It is our opinion that this 
committee, through hearings such as this, can play a positive and 
supportive role in assisting the Bureau in achieving these goals. 

That concludes my comments, Mr. Chairman, and we, too, stand 
ready to answer any questions you might have. 

[The statement of Mr. Franke and his response to written ques- 
tions follow:] 




Testimony of Mr. Gail A. Franke 
Vice President of Federal Government 
Operations, National Computer Systems 
Presented to the House Post 
Office and Civil Service Committee, 
Subcommittee on Census and Population 



May 15, 1986 



Mr. Chairman, on behalf of myself and National Computer Systems (NCS), we 
appreciate the opportunity to appear before your subcommittee today. I am 
accompanied by Mr. Jeffrey Goldberg, NCS's Program Manager for the Bureau of 
the Census, and Mr. Robert Roelf, NCS's Project Manager for Census-related 
activities. 



Since 1962, National Computer Systems has been providing technological 
solutions for the large scale data collection needs of education, government 
and industry. As a corporation, National Computer Systems, in the fiscal year 
just ended, generated revenues in excess of $215 million, and maintained a 
workforce of nearly 2,000 employees nationwide. National Computer Systems is 
the world's largest supplier of Optical Mark Reading Systems. 

An Optical Mark Reader (OMR) directly converts marks made on preprinted 
paper forms into machine readable computerised data. Over the last 24 years, 
NCS has supplied its OMR systems to a broad range of users. The earliest 
applications of Optical Mark Reading occurred in the field of educational 
measurement. Optical scanners have been used in this area for over 20 years 
to provide high speed scoring of standardized educational tests. As the 
educational and career options of millions of students have depended on the 
results of standardized tests, it has been critical from the outset that OMR 
systems provide high speed, high volume test scoring and, at the same time, 
maintain the highest standards of accuracy. While OMR systems were first 
pioneered in the area of education, they have since been successfully used for 
many other large scale data collection tasks including survey research, 
application processing, health care reporting and order entry, to name just a 
few. In addition to the manufacture of OMR systems, NCS is one of the 
largest processors of OMR scannable documents in the world. Receiving 
annually over 130,000,000 scannable forms at its Iowa City, Iowa facility, NCS 
has established industry standards for the logistics management of large 
volume data collection projects. 

NCS's experience as a supplier of systems and services to the federal 
government has also been extensive. We process about 7 million applications 
for student financial aid through our Information Services Division, and we 
provide large scale distributed OMR data entry systems to many agencies of the 
federal government, including the Air Force, Postal Service, FAA, OPM, and the 
Department of Education for "in-house" scanning of OMR forms. NCS currently 
has over 1,000 systems installed in various federal installations. 

Finally, NCS has transported its OMR technology to foreign markets. In 
the international sector, NCS has again served the complex data collection 
needs of education, government and industry. It was in the international 
arena that NCS experienced its first success in supporting census data 
collection. NCS OMR systems have been used to conduct the population censuses 
of Venezuela, Chile, and the Dominican Republic, and the Mexican agricultural 
census. The experience gained in supporting international census data 
collection has served to both educate NCS to the special requirements of 
census processing as well as to clearly identify the major benefits which can 
be derived from using OMR systems for Census data capture. 

During the period of time when NCS was supporting foreign census programs, 
it also engaged in discussions with the U.S. Census Bureau regarding the 
possible role OMR could play in processing a U.S. decennial census. Between 
1980 and 1984 several delegations of census officials travelled to NCS 
facilities to view NCS OMR systems and talk to NCS engineers regarding the 
application of OMR technology in the 1990 Census. During this same period, 
NCS personnel toured various Census facilities in order to gain a more 
detailed understanding of U.S. Census requirements for 1990 and to understand 
more fully the Bureau's resident data capture technology, FOSDIC. 

In 1984, the Bureau acquired an NCS OMR system, the W-201 for testing and 
evaluation. This model had been used successfully in several international 
census processing projects. 

In making the commitment to test OMR, we believe the Bureau recognized two 
major benefits which would be derived from OMR processing. First, OMR systems 
provide direct source document data capture. This means that information is 
extracted directly from the paper form as opposed to indirectly, which occurs 
under FOSDIC's filming, developing and film scanning process. Using a direct 
source document data capture system, there is reduced clerical handling of 
forms which translates into a reduction in processing turnaround time and 
therefore, a reduction in cost. The second advantage of OMR is that the 
technology can be deployed, if so desired, in a highly decentralized fashion. 
In as much as the Bureau was interested in converting from the centralized 
batch processing mode which it employed in the 1980 Census to a "flow based" 
or "real time" processing scenario for the 1990 Census, it was critical that 
the data capture technology be able to be distributed close to subsets of the 
respondent population. By processing in relatively close proximity to 
respondent households, the Bureau would be able to accomplish its goal of 
concurrent processing and thereby more efficiently identify non-respondents 
and correct forms which failed various content edits. The end result would be 
more complete coverage and cleaner data. 

After preliminary testing of the NCS scanner, the Bureau decided to use 
the OMR system to process the "short" form in the 1985 Tampa, Florida, 
pretest. Between August 1984 and February 1985, NCS worked with the Census 
Bureau to prepare for the Tampa OMR test. NCS assistance included advice and 
guidance relative to software development, field engineering adjustments to 
the scanning system, and support in forms design and printing. 




With regard to forms design, the Bureau originally asked NCS (using the 
model of the 1980 Census short form) to do a design of a scannable short form 
for the 1985 pretest. In responding to the Bureau's request, NCS initially 
designed the form as a 4 page booklet. In order to process the document on 
the NCS scanner, the 4 page booklet would be slit into two 8.5"x11" sheets for 
scanning. This is a standard booklet processing procedure used by all major 
processors of OMR scannable documents. As an example, a major commercial user 
of NCS systems processes over 11,000,000 24-page booklets per year using this 
methodology. The Bureau stated, however, that this process was not consistent 
with their standard procedures. NCS originally proposed the 4 page document 
because it believed that, given the number of response items, it would be 
easier for respondents to complete a form that spread data items across four 
pages. In deciding not to slit the booklet, the Bureau required that all data 
items be compressed to a single two-sided 8.5"x11" sheet. While this was 
accomplished for the 1985 pretest, the actual document was rather dense when 
viewed from the respondent's perspective. NCS and the Bureau agreed that the 
OMR form, if it was to be constrained to a single sheet, would have to be 
larger. 

Another issue which needed to be addressed in the 1985 pretest was the use 
of No. 2 pencils to complete the Census forms. Commercially available OMR 
scanners have been designed to read marks made by No. 2 pencils. The 
historical basis for this requirement can be found in the fact that OMR 
systems were first used in educational testing environments where students 



196 



193 



were presumed to have pencils or could be provided with pencils at a test 
administration site. Other users of OMR systems beyond the educational sector 
never saw the use of a pencil as a limiting factor. As a consequence, it has 
never been a design requirement to equip NCS scanners with the ability to read 
marking instruments other than pencils. It is interesting to note, that where 
OMR has been used in international census taking, the actual censuses were 
100% enumerator administered. Within such a controlled environment it was 
practical to supply enumerators with pencils. For the 1985 Tampa pretest, the 
Census Bureau decided to provide pencils to the 125,000+ respondents as part 
of the mailing packet. It was clear, however, that if OMR were to be used as 
the primary data capture technology for the 1990 Census it would not be 
realistic to provide pencils to 100,000,000 households. Therefore, the 
scanner would have to be enhanced so that it would read other types of marking 
instruments. 

As NCS continued to work with the Bureau in preparation for the 1985 
pretest, we began to see that, for OMR to provide optimum performance and 
benefit within the U.S. Census environment, certain enhancements/modifications 
would need to be made to the NCS standard commercially available product. 
Accordingly, in January 1985, NCS submitted an unsolicited proposal to the 
Census Bureau to provide an OMR system which would meet the specific and 
unique requirements of Census processing. In this proposal we presented a 
comprehensive plan for engineering the modifications which would be required 
in order to meet the Census Bureau's needs. Major 
tasks included the modification of the scanner transport in order to read a 
larger form (11"x17" as opposed to 8.5"x11"); enhancement of the optics system 
in order to reliably read marking instruments other than a No. 2 pencil (e.g., 
ball point pen, flair tip marker, etc.); and the development of an internal 
humidity compensation and skew detection and correction capability which would 
ensure the accurate reading of forms regardless of changes in the forms' 
dimensions resulting from variable environmental conditions. 

NCS proposed to deliver this enhanced scanning system in two phases. The 
initial enhanced model would have been delivered in January 1986 in time for 
use in the 1986 Los Angeles County pretest. This system would have had the 
capability to read the large form. In September 1986, NCS would have 
delivered all other system features including variable marking instruments, 
humidity compensation and skew detection. At that time, the Bureau would have 
been able to complete its testing and evaluation of OMR in sufficient time to 
meet its then operative deadline for making the decision on the primary 
technology to be used for data capture in the 1990 census. 

Between January 1985 and May 1985 the Bureau considered the NCS proposal. 
Several meetings occurred between NCS and Bureau personnel. Also, during this 
period (March 1985 - June 1985), the Bureau was evaluating the performance of 
the commercially available NCS scanner which was being used to process the 
1985 Tampa pretest. From the NCS perspective it was essential that the OMR 
system employed in the Tampa pretest perform successfully in order for the 
Bureau to be willing to undertake further testing of an enhanced OMR system. 
During the Tampa pretest, the OMR system performed very well, considering that 
it was a "standard off-the-shelf" product operating in a highly specialized 
environment. In essence, the pretest proved that OMR as a generic technology 
could successfully support decennial data capture processing. The pretest 
also confirmed, to no one's surprise, that the standard commercially available 
product would need to be modified in order to meet Census requirements. 
Indeed, the OMR evaluation report prepared by the Bureau's Technical Services 
Division validated the need for many, if not most, of the enhancements NCS had 
proposed to make to its standard product. 

The Bureau's key decision makers agreed that the results of the OMR test 
in Tampa justified the need for further testing of an enhanced OMR system. 
Accordingly, on June 21, 1985, the Bureau announced in the Commerce Business 
Daily its intention to contract with NCS for the delivery of an enhanced OMR 
system to be used in the 1986 Los Angeles County pretest. In this CBD 
announcement, the Bureau requested that other concerns having the capability 
to provide the required OMR system respond within 30 days. In August 1985, 
NCS was informed that the Bureau had received responses from other interested 
parties and that, on the basis of these responses, it would not proceed with 
an award of a contract to NCS. The Bureau stated, however, that it would be 
issuing a Request for Proposal for the desired OMR system. In September 1985, 
a draft specification was released by the Bureau for comment. NCS submitted 
written comments to the Bureau in anticipation of the prompt release of the 
formal RFP. In early November 1985, NCS was informed by the Bureau that it 
had decided to cancel the OMR procurement and to discontinue any further 
testing or evaluation of OMR for use in the 1990 Census. 

The Bureau provided 3 principal reasons for cancelling the OMR evaluation 
program. First, the Bureau believed that when comparing OMR and the Bureau's 
FACT 90 System from a cost/benefit perspective, OMR did not appear, on the 
basis of the Bureau's assumptions, to provide either sufficient cost savings 
or functional advantage to justify further testing. Second, the Bureau 
believed that it had run out of time for making its technology decisions for 
the 1990 Census. Thus, the original schedule which called for a technology 
decision to be made in the fall of 1986 was too late. In order for sufficient 
FACT 90 systems to be ready for the 1990 Census, a decision relative to its 
use in the 1990 Census could not be postponed until the fall of 1986. Given 
that the Bureau felt OMR still required additional testing to confirm its 
viability for decennial census processing, and given further that the FACT 
system had been used for decennial data capture since 1960, the Bureau claimed 
that there would be no advantage to be gained from continuing the OMR testing 
program. Lastly, the Bureau believed that the enhanced OMR system which they 
would have received for testing in 1986 would not necessarily have proven the 
ultimate utility of OMR for decennial Census production processing. 

NCS has taken exception to the reasons given for cancellation of the 
technology evaluation program. With regard to a cost/benefit comparison of 
OMR and FOSDIC, NCS continues to believe that the direct source document data 
capture provided by OMR and the ability to distribute OMR technology widely in 
the field serve to more effectively promote the Bureau's goals of concurrent 
processing and accelerated document turnaround than does FOSDIC. To be sure, 
OMR does not currently provide certain features available through FOSDIC. OMR 
processes multi-page booklets (e.g., long form) by slitting the booklet, 
scanning the sheets and, if desired, automatically reassembling the booklet 
after the scanning process. Since FOSDIC forms must first be microfilmed 
before scanning, FOSDIC produces an archival record of the Census form as a 
by-product of that microfilming process, whereas under OMR processing either 
a microfilm capability would have to be added to the OMR system or 
microfilming would have to be done as a separate and possibly later process. 
On the other hand, OMR processes paper directly, whereas FOSDIC data capture 
requires a sequential process of microfilming, microfilm developing, and 
microfilm scanning. If a Census form fails edit under the OMR processing 
scenario, the correction can be made directly to the form and the paper 
directly rescanned. Under FOSDIC, however, if a document fails edit, after 
the correction is made on the form, the entire filming, developing and 
scanning process must be repeated. 

With regard to the Bureau's need to accelerate its technology decision 
timetable, NCS can only state that it had been operating for over a year and a 
half under a schedule that assumed a technology decision would be made in the 
fourth quarter of calendar year 1986. NCS felt that this schedule was 
reasonable when viewed from a development, testing and procurement perspective 
as well as gearing-up for production and deployment. The sudden change in 
schedule was very unexpected given the precision with which the Bureau had 
originally laid out its timetable and master plan. 

Finally, the system that NCS would have provided for testing in the 1986 
Los Angeles pretest would have allowed the Bureau to evaluate all mandatory 
features and components. The system would have been a pre-production model 
that would have permitted the Bureau to determine whether OMR could operate as 
the primary data capture technology for the 1990 census. 

Notwithstanding the Bureau's decision to terminate the OMR technology 
evaluation program, NCS believes the Bureau conducted this program consistent 
with the highest standards of professionalism and with a genuine commitment to 
a thorough and fair evaluation. 

In stepping back, however, from the specifics of the Bureau's OMR 
technology evaluation program, there are several general observations that can 
be made. The Bureau has had a long and successful history as a pioneer in the 
use of advanced technology and innovative methods. Recently, however, the 
Bureau has had more difficulty making significant technological advancements. 

In planning and implementing a decennial Census, the Bureau is continually 
having to make decisions along a risk vs. opportunity continuum. While the 
stakes are high in conducting a decennial Census, it has been our observation 
that the Bureau tends to place its primary emphasis on risk minimization as 
opposed to taking advantage of the opportunities to implement major 
advancements in the use of technology for the decennial census. We believe 
that part of the problem can be traced to the Bureau's planning process. The 
Bureau seems to wait too long between decennial censuses before it begins to 
define its approach to technology evaluation. Consequently, as Census Day 
approaches, the Bureau has been forced to fall back on procedures and systems 
used in previous censuses because sufficient time no longer remains to 
successfully test, procure and implement innovative new automation systems. 
The Census Bureau seems to repeatedly put itself in a continuous cycle of 
developing plans to move forward with its decennial automation only to be 
forced to fall back on extant data processing systems, with some measure of 
Bureau-developed enhancements. 

This is a problem that has historical perspective in that the Bureau has 
consistently tended to rely heavily on the use of its in-house capabilities 
for the development of any new or innovative methods or systems. Early in the 
Bureau's history, this approach was necessary, based on the lack of 
commercially available alternatives. Today, however, the commercial sector 
provides a broad range of products and services which could be of direct 
benefit to the Census Bureau. 




Where should the Bureau be going for the 2000 Decennial Census? It is 
very clear to NCS that planning for the year 2000 should begin today. The 
Bureau should undertake an immediate research and development effort. This 
R&D program should concentrate on developing a qualitatively superior data 
capture system. Such a system must combine the best features of currently 
available technologies, including FOSDIC, OMR, OCR, image capture, optical 
disc technology, on-line microfilming and multi-page booklet processing. 
Additionally, the design philosophy of such a program must be flexible enough 
to accommodate applicable emerging technologies. The goal of this R&D program 
should be to develop a Census data capture system which would exceed all that 
is currently available when viewed from the perspective of cost effectiveness, 
efficiency and overall function. We believe very strongly that this project 
must be a collaborative effort wherein the best resources of the Bureau would 
work closely with experts drawn from external organizations. 

Additionally, the Bureau should continue examining and analyzing all 
emerging technologies, with specific attention directed to those technologies 
which could ultimately provide a truly electronic data capture capability. 

Finally, we wish to reemphasize the importance of beginning this project 
immediately. If the Census Bureau begins a research and development program 
this year, it is possible to have a prototype system ready for live testing 
during the 1990 Census. For this goal to be realized, however, the Bureau 
must make the R&D program a high priority. 




National Computer Systems responses to the Questions submitted by the 
Committee on Post Office and Civil Service, Subcommittee on Census and 
Population. 

Question No. 1: As you know, the Census Bureau provides a back-up system to 
the birth and death records of the country. If they are 
ever lost or damaged, people can obtain vital documentary 
evidence by consulting a microfilm copy of their old census 
records. If the Census Bureau decided to switch to the OMR 
technology, how would they continue to serve this need? 

NCS Response: During the period of time that NCS was working with the 
Bureau on the OMR technology evaluation, NCS had several 
discussions with Bureau personnel relative to Census 
archival record requirements. NCS understands fully the 
Bureau's need for a microfilm copy of the original census 
form. NCS has defined three possible alternative 
approaches to satisfying this requirement. Probably the 
most efficient and cost-effective approach would be to 
equip the OMR scanner with an in-line microfilming 
capability. NCS engineers have assessed the feasibility of 
mounting a microfilm camera within the OMR transport 
housing. Based on this assessment, we believe it is 
totally within the realm of possibility to equip an OMR 
system with a microfilming capability. It is assumed that 
document microfilming would occur as a continuous process 
following OMR data capture. However, instead of developing 
the microfilm within the processing office, since the 
microfilm copy is not essential to the data capture process 
itself, it is assumed that the microfilm cartridge would be 
shipped to the Jeffersonville, IN, Data Preparation 
facility for subsequent developing. The second alternative 
would be to have a separate microfilm camera at the 
processing office in order to create the archival record of 
the Census form. Under this scenario, after source 
documents were scanned on the OMR system, they would be 
passed on to a microfilming operation. Again, the raw film 
would be shipped to Jeffersonville for developing. 
Finally, the last alternative would be to ship the Census 
forms themselves, to Jeffersonville for microfilming after 
OMR data capture was completed in the processing office. 
We consider this last alternative least desirable based on 
the cost entailed in shipping all Census forms to 
Jeffersonville. 

Question No. 2: In your statement, you say that each year NCS processes 130 
million OMR documents. Just to place the Census in 
perspective, the decennial involves more than 106 million 
housing units, each of which will have a form. 
Furthermore, 85% of these forms must be processed in less 
than 10 weeks — a very short period of time. It looks 
like the job of processing the Census with OMR could easily 
require equipment that involved several times the capacity 
of your firm. I have two questions about this: 

First, would it be possible to obtain this amount of OMR 
equipment? 



Second, what would the government do with the equipment 
when we are finished with it? Could it be leased so that 
we would not have to absorb the full cost of buying it? 

NCS Response: A. Yes. NCS's manufacturing capabilities are more than 
adequate to meet the OMR scanning needs of the Decennial 
Census. Had the Bureau stuck to its original timetable for 
technology evaluation and procurement, the requisite number 
of scanning systems would have been ready for the 1988 
dress rehearsal and the 1990 Census. 

B. NCS has had several discussions with the Census Bureau 
regarding alternative procurement approaches for the 
required quantity of OMR scanning systems, and was prepared 
to address that issue in contract negotiations. We have 
always assumed that a combination of a lease and purchase 
arrangement would be the approach best suited to the Census 
Bureau. Under this approach, the Census Bureau would 
purchase a certain number of OMR systems to be retained as 
residual technology after the 1990 Census. The remaining 
systems required for decennial Census processing would be 
leased to the Bureau for the duration of decennial data 
capture. NCS believes that there would be an international 
market for the leased systems among countries conducting 
their censuses between 1993 and 2000. 
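
The capacity concern raised in Question No. 2 can be checked with rough arithmetic. The sketch below uses only the figures quoted in the question and in the NCS statement. The one added assumption, labeled in the code, is that NCS's annual volume is spread evenly across 52 weeks, which the record does not state.

```python
# Figures quoted in Question No. 2 and in the NCS statement.
HOUSING_UNITS = 106_000_000        # forms in the decennial census
SHARE_IN_PEAK = 0.85               # share to be processed in the peak window
PEAK_WEEKS = 10                    # "less than 10 weeks"
NCS_FORMS_PER_YEAR = 130_000_000   # NCS annual OMR document volume

# Assumption (not in the record): NCS volume is spread evenly over the year.
WEEKS_PER_YEAR = 52


def weekly_rate(forms: float, weeks: float) -> float:
    """Average forms per week needed to finish `forms` within `weeks`."""
    return forms / weeks


census_rate = weekly_rate(HOUSING_UNITS * SHARE_IN_PEAK, PEAK_WEEKS)
ncs_rate = weekly_rate(NCS_FORMS_PER_YEAR, WEEKS_PER_YEAR)
capacity_multiple = census_rate / ncs_rate

print(f"Census peak rate: {census_rate:,.0f} forms/week")
print(f"NCS average rate: {ncs_rate:,.0f} forms/week")
print(f"Multiple of NCS average rate: {capacity_multiple:.1f}x")
```

Under that assumption the census peak comes to a few times NCS's average weekly volume, which is consistent with the questioner's phrase "several times the capacity." It says nothing about the number of scanners required, since the record gives no per-machine throughput.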



Question No. 3: The Census Bureau mentioned in their testimony that one of 
the reasons for not going for the OMR technology is the 
issue of cost related to adapting the OMR technology to the 
decennial processing. How much do you estimate the cost 
would be or, if you cannot estimate the cost, what are the 
cost factors to be considered? 

NCS Response: In order to provide a definite cost estimate, we would need 
a set of comprehensive and stable requirements from the 
Census Bureau. One of our major frustrations has been our 
inability to acquire a firm fix on what the Bureau's 
requirements would be. However, the principal cost factors 
for which we would need a statement of Bureau requirements 
are as follows: 

o Number of processing centers 

o Number of 100% and sample forms to be processed per 
center. 

o If the Bureau would agree to document slitting and, 
therefore, OMR scanning of the sample form, what is 
the page length of the sample form? 

o What level of redundancy would the Bureau require at 
each processing center? 



o What is the timeframe in which the Bureau wishes to 
process 100% and sample forms? 

o How many labor shifts would be utilized at each 
processing center? 

o How many machines would the Bureau intend to purchase, 
and how many would they wish to lease? 

o What type of maintenance program would the Bureau wish 
to contract for? 
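
As a rough illustration of how the cost factors listed above could combine into an equipment estimate, the following sketch computes a scanner count for one processing center. Every name and number in it is hypothetical: the record gives no scanner throughput, shift length, working week, or redundancy level, so `sheets_per_hour`, `hours_per_shift`, `days_per_week`, `redundancy`, and the sample values are placeholders, not NCS or Bureau figures.

```python
import math
from dataclasses import dataclass


@dataclass
class CenterPlan:
    """Hypothetical inputs mirroring the cost factors in the NCS response."""
    forms_per_center: int     # 100% and sample forms handled by the center
    sheets_per_form: int      # sheets scanned per form after any slitting
    processing_weeks: float   # timeframe for processing the forms
    shifts_per_day: int       # labor shifts at the center
    hours_per_shift: float    # assumed shift length
    days_per_week: int        # assumed working days per week
    sheets_per_hour: int      # assumed scanner throughput (not in the record)
    redundancy: float         # spare capacity fraction, e.g. 0.25 for 25%


def scanners_needed(plan: CenterPlan) -> int:
    """Scanners required to finish on time, including spare capacity."""
    total_sheets = plan.forms_per_center * plan.sheets_per_form
    scanner_hours = (plan.processing_weeks * plan.days_per_week
                     * plan.shifts_per_day * plan.hours_per_shift)
    base = total_sheets / (scanner_hours * plan.sheets_per_hour)
    return math.ceil(base * (1 + plan.redundancy))


# Placeholder values chosen only to exercise the calculation.
plan = CenterPlan(
    forms_per_center=10_000_000,
    sheets_per_form=1,
    processing_weeks=10,
    shifts_per_day=2,
    hours_per_shift=8,
    days_per_week=5,
    sheets_per_hour=3_000,
    redundancy=0.25,
)
print(scanners_needed(plan))
```

Multiplying the per-center result by the number of processing centers, and splitting the total between purchased and leased machines, would cover the remaining factors in the list. Maintenance terms would enter as a separate cost line rather than through the machine count.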



Mr. Garcia. Before we do that, what I would like to do is ask 
both Mr. Dodaro, Ms. Miskura, and Mr. Bounpane to come on back. 

I would like the Bureau of the Census to respond, especially to 
the last statement made by Mr. Franke about the R&D, and how 
do we go about making that decision fast enough and quick enough 
for the 1990 census. 

Ms. Miskura. We certainly have learned, from our attempts to 
plan the automation of the 1990 census, the amount of leadtime 
that it takes to explore the technologies that are available, to un- 
derstand the possible technologies that will be available between 
the time our planning starts and when we actually have to take 
the census. 

Those lessons certainly have told us that we do need to start re- 
search and development of data capture technology and other auto- 
mation features of the year 2000 census as soon as possible. 

It will be a Bureau priority to start defining the year 2000 census 
and automation requirements as soon as possible. We would re- 
quest funding for that and we will use the opportunity of the 1990 
census for research and experimentation with possible methodolo- 
gies and technologies. 

We have long thought about the ideal Census being a paperless 
Census based on electronic means of having data available on the 
computer and we do intend to work towards that end. 

Mr. Garcia. Mr. Dodaro. 

Mr. Dodaro. We certainly would support moving out in the di- 
rection that has been suggested. And in terms of planning for the 
year 2000, we have continually advocated the need to make maxi- 
mum use of leadtime and think that is an opportunity. 

The only caveat I would add to that is that it still may not be too 
late to include certain things in the 1990 census. And I would not 
want to see us move out with the idea of writing off the 1990 
census and moving just totally, exclusively into planning for 2000. 

I think the timing that Mr. Franke suggested about using the 
1990 census as a test, is a good idea. 

Mr. Garcia. Why can't we do that? If we can't do it nationally, 
why can't we do it regionally? Why can't we just try? 

Ms. Miskura. To the extent that we identify potentially useful 
technologies and methodologies we will have the opportunity in the 
1990 census to try them, perhaps, for — within a processing office or 
within a set of collection offices. 

To the extent that we can learn about the technologies and to the 
extent that we feel that it is appropriate to try to learn about them 
in the context of the decennial census our plans are to do that. 

Mr. Garcia. I was flown out to Iowa City to look at the facility in 
1979, just prior to the 1980 census. Now 1986 — and we are still 
talking. 

I have all the respect in the world for the agency, but — I guess 
some decisions are not being made when they should really be 
made. At least we should try — so we try and we lose — we will try 
something else; the idea is to really try and be ahead. 



I don't think you should be just on a par. I think the Bureau of 
the Census should be so far out in front that everybody is looking 
to them for the leadership. I mean that. I say that with all due re- 
spect to the agency. 

Again, the Census Bureau stated in their testimony why they 
chose not to use OMR technology for the 1990 decennial. One of 
their reasons is that the scanners are sensitive to temperature and 
humidity. Another reason is that they experienced difficulties in 
evaluating the scanner accuracy. 

Any comments? 

Ms. Miskura. In our defining the parameters of our interest in 
the optical mark recognition technology early in our planning proc- 
ess, and our funding for planning the 1990 census officially began 
in fiscal year 1983, what we had hoped to do was to apply a princi- 
ple that I think has been talked about this morning, which is to 
bring data capture as close as possible to the source of data that we 
could. 

The idea, therefore, was to try to develop— or see whether OMR 
could be developed to decentralize as far as the collection office, 
400 to 500 different locations. We had to make sure that the ma- 
chine was fairly robust, so to speak, because in identifying 400 to 
500 different locations for collection offices across the country we 
might not be able to get ideal space all of the time. 

Our looking at OMR in Tampa was really the first step in what 
we envisioned as an ongoing process. The first thing we wanted to 
know was would we be able to function with a minimum of envi- 
ronmental controls. We did feel that we got the answer to that 
question, at least in terms of the existing technology, that we 
would not be able to do that in 400 to 500 different places. 

Mr. Franke. Mr. Chairman, may I respond to that? 

Mr. Garcia. Yes, please. 

Mr. Franke. I was not at the Jeffersonville pretest— my two as- 
sociates were, and I would like them to comment in a bit more 
detail. 

But I would submit that the problems experienced were not nec- 
essarily problems associated with the scanning system, but rather 
problems associated with the form. As with a lot of data processing 
media, including magnetic tapes and discs that are transported any 
distance, that media must be acclimatized before it is inserted into 
a data processing system. I believe that during the Jeffersonville 
pretest we were transporting documents from Tampa up to Indiana 
in March. So the documents underwent a rather severe tempera- 
ture change and humidity change. 

We suggested acclimatization; I believe, in fact, the Bureau 
did acclimatize, but I would suggest that it may have been more the 
form than the hardware itself. And I think that too affected, per- 
haps, the accuracy of the system. 

If either one of you would like to comment on that. 

Mr. Goldberg. Mr. Chairman, the only additional comment I 
would like to make 

Mr. Garcia. For the record — your name? 

Mr. Goldberg. Mr. Goldberg. Jeffrey Goldberg. 

The only additional comment I would like to make with regard 
to the environmental issues was that it is true, we did recommend 
to the Bureau in the 1985 pretest that they acclimatize the forms. 
It is also true that we recognize that in a full-featured decennial 
census the possibility of conducting a 24-hour acclimatization of 
forms was not realistic or feasible. 

And that is why in our proposal to the Bureau to deliver an en- 
hanced system we proposed to develop, if you will, an internal hu- 
midity compensation capability in the scanner, that would basically 
allow the scanner to dynamically adjust as the form was passed 
through the scanner for changes in the size of the form. We were 
familiar with that requirement, sensitive to it and proposed to ad- 
dress it in our enhanced system. 

Ms. Fernandez. Peter, it appears that probably the main reason 
why the Bureau has not moved to continue following up on the 
OMR improvements and modifications is the constraints of time. 
It seems that that is a perennial problem. As you know, on April 
18 of last year we had the Census Bureau, the General Accounting 
Office, and Mr. Funk, inspector general of the Commerce Depart- 
ment, testify. And consistent with GAO's testimony today was Mr. 
Funk's testimony that unless there is a long term planning func- 
tion at the Bureau, this is going to be a cyclical problem and that 
we can have the same expectations in the year 2000. 

What is the Bureau planning to do to take it out of that cycle of 
planning for the immediate decennial, which brings to bear time 
pressures, time constraints, money constraints? When does the 
Bureau plan to move to planning for the next decennial, that does 
not have those time pressures and would allow you to explore 
modified versions, do demonstrations in various areas? 

Mr. Bounpane. I think you have certainly put your finger on a 
very difficult problem, which is that there are limited resources in 
any agency, and I am not just talking about money, I am talking 
about people, as well, and when you get up against the crunch of 
doing the census it is very hard to think about tomorrow. 

One way we are trying to solve that is through a current pro- 
posed reorganization which would set up some positions outside — 
not completely outside — but somewhat apart from the mainstream 
Census operations, and to put the development of the research and 
evaluation program for the year 2000 in this new arm. That should 
have some advantage; there will still be internal conflict, of course, 
for the people to work on it, but it will give an opportunity for 
some people to be thinking about the future and not be burdened 
with the day-to-day activities of making sure that 1990 works right. 
And I hope that change will help a little bit in this problem or we 
will be in the same position again. 

Ms. Fernandez. If I recall, the inspector general also suggested 
that maybe the Census Bureau should look to outside experts to 
give them this technical advice. I think one of the common themes 
of both the testimony of the General Accounting Office and NCS is 
that as with all bureaucracies, we tend to hold on to what we 
have— it is risk-free and it has been demonstrated on a large scale 
decennial and that we don't want to shake the boat. There is also 
self-interest in terms of the manufacturing of the FACT 90 ma- 
chine because it is manufactured in house and there may be some 
resistance there towards other automation systems that may be 
manufactured outside. 




How will that be dealt with? I view that also as a particular 
problem that may not be solved by the reorganization that you just 
mentioned? 

Mr. Bounpane. Yes. That, again, is a good point. Let me tell you 
what we do with regard to the American Statistical Association 
and perhaps something parallel to that should be done with the set 
of private sector vendors. 

In developing the research and evaluation program for the 1980 
census, we did ask our statistical colleagues to come in and make 
suggestions to us about things we should test in 1980 toward 1990. 
But it was in the limited arena of how best to take the census and 
statistical aspects of it. 

Perhaps we need to do something parallel with the private 
sector, which says tell us what we should test in 1990 towards the 
year 2000 with regard to equipment available to us to take and 
process the census. The same kind of advisory approach with a 
panel like that might help in developing our year 2000 plans and I 
think that is something we will pursue. 

Mr. Franke. Mr. Chairman, may I take off my NCS hat for just 
a moment to speak as a member of this industry. We have enjoyed 
our association with the Bureau of the Census. They have some 
challenges that we in private industry do not get to see very often, 
especially in the area that we are involved in. 

I think that you would find that by going to the private sector, 
Peter, you would get a rousing response. We are out there, we are 
very interested, we have some good technology and we have some 
good people. I think it is a function of how that relationship is 
structured and where we are going to go with it that is going to 
dictate the degree with which the private sector is willing to 
commit to it. And I think if we can together— if you can set down 
some goals and invite us in to comment, to be involved in it— and I 
am certain I speak for most of the companies that are in this envi- 
ronment—we would welcome that opportunity. 
Mr. Garcia. Mr. Dodaro. 

Mr. Dodaro. Mr. Chairman, we would also support that. When 
Mr. Bounpane mentioned earlier, too, the strain on resources and 
constraints within the Department, I think it is only logical that a 
well-articulated planning strategy for where they want to go with 
the year 2000 ought to involve some active continuing advice from 
the private sector. 

Mr. Garcia. If I may make a suggestion to the three organiza- 
tions who are represented here today. I wholeheartedly agree, and 
this subcommittee would be only too happy to be the fourth part of 
that. In fact, there can be congressional input if some obstacles do 
arise and we are in a position to help. I think that that is the pur- 
pose of these hearings. 

I think one of the other long term gripes I have with the Bureau 
of the Census is not any person at the Bureau, but because the pol- 
itics is always there and there is a new director for each decennial. 
I would like to just take one director and keep him there forever. 
[Laughter.] 

I mean that, because those decisions come from the top and who- 
ever he or she may be, we shouldn't change the director constant- 
ly. What happens is the middle management of the census, which 
is there all the time become so accustomed to seeing directors 
coming and going that they question when is this person going to 
leave. I just don't think that speaks well for an agency, especially 
for one that disseminates information which is so absolutely vital 
for the health and the future of this country. 

I think that one of the things that I am going to push for is to 
have a person at the head of the Bureau of the Census who is free 
of that political consideration, Democrat or Republican. They can 
just stay there, as a professional, and run the operation so that the 
people below them know that there is somebody there who will 
make a decision; and that that decisionmaker will be there in 2, or 
3 years, or 5 years, or 10 years — however long they may decide that 
that is what they want to do. And I really feel strongly about that. 

Having said that, I would like to, again, thank the three organi- 
zations who are present. I think this is what this subcommittee's 
aim is — to bring you together in honest, open and frank dialog. If 
we can work together without concern that we are looking to be 
critical of one another, or each other, and that we are really work- 
ing in a good, solid framework I think we can have a good census 
in 1990 and we can have a Bureau of the Census that will be able 
to provide this country with quality data. 

Thank you very much, I really appreciate your presentation. I 
know how busy all of you are. I have some questions that counsel 
will be submitting to the three of you, and I would appreciate it 
very much if you would look over the questions. As we have done 
in the past, if you can get those answers back to us, we can complete 
this record that we have established here today. 

Thank you very much. 

[Whereupon, at 12:15 p.m., the subcommittee recessed to recon- 
vene at the call of the Chair.] 




